multiple passes question

Infosphere's Quality Product

Moderators: chulett, rschirm

vijaydasari
Participant
Posts: 29
Joined: Sun Jul 22, 2007 3:25 pm

multiple passes question

Post by vijaydasari »

I have a match specification with two passes; the match type I am using is Unduplicate Dependent. The first pass matches on address details, and based on the address match I see master and duplicate records.

The second pass is a customer ID match: if there are different addresses with the same customer ID, the objective is to treat them as a single customer.

The problem is that master records that have weights in the first pass become residual records in the second pass, even though they have the same customer ID.

How can I make them a master and duplicate pair?
Vijaya K Dasari
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Can you please supply more details about the blocking fields and match rules used for each pass?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vijaydasari
Participant
Posts: 29
Joined: Sun Jul 22, 2007 3:25 pm

Post by vijaydasari »

pass1

block commands

primaryname_USNAME
matchfirstname_USNAME
zipcode_USAREA
housenumber_usaddr
streetname_usaddr

match commands

address1
state code
postal code

pass2

block commands

Customer id

no match commands
Vijaya K Dasari
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

No match fields in the second pass give it no way to get a score other than 0. And if the cutoffs are also zero, then it will be a residual instead of a dupe.

So either add a matching field (e.g. the customer ID again) to manufacture a score, or reduce the cutoffs below zero to force dupes.
vijaydasari
Participant
Posts: 29
Joined: Sun Jul 22, 2007 3:25 pm

Post by vijaydasari »

Thank you very much for your suggestion. I tried adding customer ID as a match command, but I still have the same issue.

I will try the option of setting the cutoffs below zero.
Vijaya K Dasari
BI-RMA
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Re: multiple passes question

Post by BI-RMA »

vijaydasari wrote: I have a match specification with two passes; the match type I am using is Unduplicate Dependent. The problem is that master records that have weights in the first pass become residual records in the second pass, even though they have the same customer ID.
Hi Vijay,

If you use Unduplicate Dependent, data will only run through the second match pass when match pass one was unsuccessful. Duplicates are removed from further match consideration after the first pass. They do not become residuals in the second pass.

If you want your second match pass to contain all records, you have to use Unduplicate Independent.
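To make the difference concrete, here is a minimal Python sketch of which records reach the second pass under each match type, as described in this post. It is a toy model, not QualityStage code: exact equality on one column stands in for a real weighted match pass, the first record of each group is arbitrarily treated as the master, and the function names and sample rows are hypothetical.

```python
from collections import defaultdict

# Toy model only: exact equality on one column stands in for a real match pass.
def first_pass_duplicates(records, key):
    """Return ids flagged as duplicates: every record after the first in a group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["id"])
    duplicates = set()
    for ids in groups.values():
        duplicates.update(ids[1:])  # ids[0] plays the role of the master
    return duplicates

def second_pass_input(records, first_pass_key, mode):
    """List the record ids that the second pass gets to see (hypothetical helper)."""
    if mode == "independent":
        # Every pass sees every record.
        return [rec["id"] for rec in records]
    if mode == "dependent":
        # Duplicates found in the first pass are taken out of consideration.
        dropped = first_pass_duplicates(records, first_pass_key)
        return [rec["id"] for rec in records if rec["id"] not in dropped]
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical sample data: one customer spread over two addresses.
records = [
    {"id": 1, "address": "1 MAIN ST", "customer_id": "C1"},
    {"id": 2, "address": "1 MAIN ST", "customer_id": "C1"},
    {"id": 3, "address": "9 OAK AVE", "customer_id": "C1"},
    {"id": 4, "address": "9 OAK AVE", "customer_id": "C1"},
    {"id": 5, "address": "5 ELM RD", "customer_id": "C2"},
]

print(second_pass_input(records, "address", "dependent"))    # [1, 3, 5]
print(second_pass_input(records, "address", "independent"))  # [1, 2, 3, 4, 5]
```

In this toy data, records 2 and 4 never reach a dependent second pass, so a customer ID pass cannot regroup them directly; an independent second pass sees all five records.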
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
vijaydasari
Participant
Posts: 29
Joined: Sun Jul 22, 2007 3:25 pm

Post by vijaydasari »

I included customer ID in the match commands and gave it agreement and disagreement weights of 10 and -15 respectively.

Also, in the cutoff values section, the match and clerical fields only accept values from 0 to 999999.99.

I tried different values for the match and clerical fields, but I am still facing the same issue.
Vijaya K Dasari
vijaydasari
Participant
Posts: 29
Joined: Sun Jul 22, 2007 3:25 pm

Re: multiple passes question

Post by vijaydasari »

Hi BI-RMA,

In the first pass I have 5 master records (record type MP) with set IDs 1, 2, 4, 6 and 9. These five records came out as residual records in the 2nd pass with the same set ID numbers, but the record type changed to RA.

My objective is to make one of the records MP and the rest DA.
Vijaya K Dasari
BI-RMA
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Re: multiple passes question

Post by BI-RMA »

vijaydasari wrote: Hi BI-RMA,

In the first pass I have 5 master records (record type MP) with set IDs 1, 2, 4, 6 and 9. These five records came out as residual records in the 2nd pass with the same set ID numbers, but the record type changed to RA.

My objective is to make one of the records MP and the rest DA.
Hi Vijay,

The more I think of it the more it seems to me that running your data through a second match-pass may be a bad idea altogether.

What you want to achieve is basically to get a common key for groups of records sharing the same customer number and this is completely unrelated to your first match-pass, right?

You can achieve that in DataStage by sorting by CustomerId and evaluating key-change.
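As a rough illustration of the sort-and-key-change idea, here is a minimal Python sketch. It is not DataStage code; it only mimics the logic of sorting on the customer ID and starting a new group whenever the sorted key changes. The function name, the `group_key` column and the sample rows are assumptions made for the example.

```python
from operator import itemgetter

def assign_group_keys(records, key="customer_id"):
    """Sort by `key` and number the groups, bumping the counter on each key change."""
    ordered = sorted(records, key=itemgetter(key))  # stable sort on the customer ID
    result = []
    group_key = 0
    previous = object()  # sentinel that never equals a real key value
    for rec in ordered:
        if rec[key] != previous:  # key change: a new customer starts here
            group_key += 1
            previous = rec[key]
        result.append({**rec, "group_key": group_key})
    return result

# Hypothetical sample rows.
rows = [
    {"id": 1, "customer_id": "C2"},
    {"id": 2, "customer_id": "C1"},
    {"id": 3, "customer_id": "C1"},
]

grouped = assign_group_keys(rows)
print([(r["id"], r["group_key"]) for r in grouped])  # [(2, 1), (3, 1), (1, 2)]
```

Every record sharing a customer ID ends up with the same `group_key`, regardless of which address set it fell into in the first match pass.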
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

Completely agree with BI-RMA: that's a much better fit than trying to shoe-horn it through a manufactured match pass.
vijaydasari
Participant
Posts: 29
Joined: Sun Jul 22, 2007 3:25 pm

Post by vijaydasari »

Thank you very much for the responses. The issue was resolved by changing the match type to Unduplicate Independent.
Vijaya K Dasari