What is the best practice for joining Nullable keys

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
balajimadhav
Participant
Posts: 15
Joined: Thu Jul 06, 2006 8:59 am

What is the best practice for joining Nullable keys

Post by balajimadhav »

Hello,

I have a job design where source and target are oracle tables. There are two additional tables to be joined with the source. Job design is source ->join1->join2->copy->target. In both the joins the key column identified is nullable and the existing sample records has no null values in the data though these join keys has been defined as nulls. At the pre join step, data has been partitioned (hash) and sorted as per the join keys in each join stage as the join key varies between join1 and join2.

I would like to clarify how to handle this case as per the best practice as i have applied modify stage on both links of join considering the rule that key column should not be null.Point to be noted is the target column to be mapped is also null in target.

Also this scenario can be handled without modify stage keeping all keys right from source to target as nullable and i get the same result in both cases (with and without modify stages). My assumption is DS will drop records with null values in join keys and even i tried to simulate the join keys as null and still i could see the data is processing successful

Recommending the table definition to be changed for nullability of key columns is not possible in current scenario and hence no changes could be made to tables. Expected data volume in each table is around 100K records in each table

please share your thoughts. i have checked all previous mail threads but couldnt get this clarified and hence created a new post
Balaji.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

In my experience, one would be converting any NULL values to an in-band value that you know doesn't naturally occur in the data before the join and then converting them back afterwards. Assuming you want the NULL values to be considered equal and be joined.
-craig

"You can never have too many knives" -- Logan Nine Fingers
balajimadhav
Participant
Posts: 15
Joined: Thu Jul 06, 2006 8:59 am

Post by balajimadhav »

Thanks for clarifying. As i understand in case if null values not in scope for my join then the null to not null conversion is not required. Also how does DS handles when two nullable keys with data being NULL when joined? i observed that it joins and produces the result as NULL. shouldn't it drop the record?
Balaji.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Please note, too, that sometimes you have to specify nullable, for example when the input link is the outer side of an outer join or if you are effecting a full outer join.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply