Matching question

Infosphere's Quality Product

Moderators: chulett, rschirm

sigma
Premium Member
Posts: 83
Joined: Thu Aug 07, 2008 1:22 pm

Matching question

Post by sigma »

Dear all

I am trying to use QualityStage to find matches within our customer master.

The challenge is that I do not have much input to go on. All I have is a file with two columns: customer name and state.

For this post please assume data is specific to US only.

I have not used standardization at all. I am just taking the raw data from the file, which has the customer name and state, and trying to find a match.

To narrow the blocking criteria, I am taking the first 4 characters of the name from each source.

So my blocking is COUNTRY, STATE and the first four characters of the name.

In my example I pass one record

HARVARD UNIV, MA

It does get decent matches but will not match HARVARD BIOLOGY

even if I keep all match scores at zero.

I realize this is not a great example, but I want to understand why it would not pick up a match on HARVARD BIOLOGY.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Blocking should group the HARV values together into a set. What match rules are you using? Are there any overflow blocks (review the match statistics)? Any records in an overflow block will be treated automatically as residuals. If that's occurring, review your blocking and matching strategies.
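To make the blocking point concrete, here is a rough sketch in plain Python (not QualityStage itself) of how a key built from state plus the first four characters of the name would group records. The function names, the sample records other than the two HARVARD ones, and the overflow limit are all assumptions made for illustration; the real overflow threshold is whatever is configured on the match pass.

```python
from collections import defaultdict

def block_key(name, state, prefix_len=4):
    # Hypothetical blocking key: state plus the first few characters of the raw name.
    return (state.strip().upper(), name.strip().upper()[:prefix_len])

def build_blocks(records, prefix_len=4):
    # Group record names by their blocking key.
    blocks = defaultdict(list)
    for name, state in records:
        blocks[block_key(name, state, prefix_len)].append(name)
    return dict(blocks)

def overflow_blocks(blocks, limit):
    # Blocks holding more records than an assumed size limit.
    # Records in such blocks would not be compared in that pass.
    return {key: names for key, names in blocks.items() if len(names) > limit}

records = [
    ("HARVARD UNIV", "MA"),
    ("HARVARD BIOLOGY", "MA"),
    ("HARTFORD INS", "CT"),  # made-up record for contrast
]

blocks = build_blocks(records)
print(blocks[("MA", "HARV")])
print(overflow_blocks(blocks, limit=1))
```

In this sketch both HARVARD records land in the same ("MA", "HARV") block, so blocking alone would not keep them apart. If they still fail to pair, the likely suspects are the match rules or a block that exceeded its size limit.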
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.