Match Specification Weight Issue

kanasai167
Participant
Posts: 63
Joined: Mon Sep 12, 2011 2:11 am

Post by kanasai167 »

My input data is:
ALEXANDER BARTON
ALEX BARTON
ALEX BARTON
ALEXENDER BARTON

In Column Details, the values for Agreement Weight and Disagreement Weight are:
InputName,Agreement,Disagreement
ALEXANDER,3.3,-3.18
ALEX,2.3,-3.03
ALEXENDER,3.3,-3.18

http://imageshack.us/photo/my-images/88 ... cation.png

I would like to know where the -1.11 comes from.
Also, why is the weight for ALEXENDER 0.19?
From what I can see, the Agreement Weight for ALEXENDER is 3.3 and for BARTON is 1.3, so shouldn't the weight be 4.6, the same as for ALEXANDER BARTON?
Thank you.

Post by kanasai167 »

*Extra information*

For the Match Command, the comparison type I use is NAME_UNCERT.
I tried different Param 1 values: 700, 750, 800 and 850.
The master record is ALEXANDER BARTON, and its weight is 4.6 every time.
Param 1,Weight for ALEXENDER BARTON
700,2.39
750,1.66
800,0.19
850,0
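
The -1.11 asked about earlier can be reproduced from this table. A minimal sketch, assuming the record weight is simply the sum of the per-column weights and that BARTON contributes its full agreement weight of 1.3 (both are assumptions taken from the figures in this thread, not confirmed product behavior). The 850 row is left out because that record became RA. The names `observed` and `first_name_contribution` are made up for illustration:

```python
# Observed record weights for ALEXENDER BARTON, keyed by Param 1 (from the table above).
observed = {700: 2.39, 750: 1.66, 800: 0.19}

# Agreement weight for BARTON as stated in the first post (assumed to apply in full).
SURNAME_AGREEMENT = 1.3


def first_name_contribution(record_weight):
    """Return the part of the record weight left after removing the surname weight."""
    return round(record_weight - SURNAME_AGREEMENT, 2)


contributions = {param: first_name_contribution(w) for param, w in observed.items()}

for param, weight in contributions.items():
    print(param, weight)
```

Under these assumptions the first-name column contributes 1.09 at Param 1 = 700, 0.36 at 750 and -1.11 at 800, so the -1.11 looks like the partial weight given to ALEXENDER vs. ALEXANDER at Param 1 = 800.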


For Param 1 = 850, the record type becomes RA.
How do the Param values affect the weight values? What formula is used?

http://publib.boulder.ibm.com/infocente ... rison.html

Param 1. The minimum threshold, which is a number 0 - 900. In other words, a higher Param 1 value causes the match to tolerate fewer differences than it would with a lower Param 1 value.

900. The two strings are identical.
850. The two strings can be considered the same.
800. The two strings are probably the same.
750. The two strings are probably different.
700. The two strings are different.

Example

The assigned weight is proportioned linearly between the agreement and disagreement weights. For example, if you specify 700 and the score is 700 or less, then the full disagreement weight is assigned. If the strings agree exactly, the full agreement weight is assigned.

Suppose you specify 850 for the MatchParm, which means that the tolerance is relatively low. A score of 800 would get the full disagreement weight because it is lower than the parameter that you specified. Even though a score of 800 means that the strings are probably the same, you require a low tolerance.
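
The quoted description can be checked against the numbers in this thread. A rough sketch, assuming the weight is interpolated linearly from the full disagreement weight at a score equal to Param 1 up to the full agreement weight at a score of 900. That is one reading of the documentation text, not a confirmed formula, and the function names below are hypothetical:

```python
def interpolated_weight(score, param, agreement, disagreement, max_score=900):
    """Assumed linear proportioning between disagreement and agreement weights."""
    if score <= param:
        # At or below the threshold: full disagreement weight.
        return disagreement
    fraction = (score - param) / (max_score - param)
    return disagreement + fraction * (agreement - disagreement)


def implied_score(weight, param, agreement, disagreement, max_score=900):
    """Invert the assumed formula to recover the score behind an observed weight."""
    fraction = (weight - disagreement) / (agreement - disagreement)
    return param + fraction * (max_score - param)


# Column weights for ALEXENDER from Column Details.
AGREEMENT, DISAGREEMENT = 3.3, -3.18

# First-name weights: observed record weight minus 1.3 for BARTON (assumed additive).
first_name_weights = {700: 1.09, 750: 0.36, 800: -1.11}

scores = {
    param: implied_score(w, param, AGREEMENT, DISAGREEMENT)
    for param, w in first_name_weights.items()
}
for param, score in scores.items():
    print(param, round(score, 1))
```

All three Param 1 settings point back to roughly the same score, about 832, which suggests the linear reading fits the observed weights. A score near 832 is below 850, so at Param 1 = 850 the first name would get the full disagreement weight, which would fit the record type changing to RA.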