Page 1 of 1

Lookup stage

Posted: Thu Sep 16, 2010 2:37 am
by srinivas.nettalam
Hi ,
Please let me know whether the conception below is right.
If the primary and reference links have the same column names for non-key columns, then even though the reference column is pulled the data in it is the data from primary link's column.
Please let me know on this ....

Posted: Thu Sep 16, 2010 3:46 am
by ray.wurlod
Welcome aboard.

You really haven't given enough information - I am presuming that your question refers to a Lookup stage.

The answer is "maybe". On the output link of the stage you can control which input columns get mapped to output columns.

Question on Lookup stage

Posted: Thu Sep 16, 2010 4:09 am
by srinivas.nettalam
Sorry for not being clear.Here is my concern.Primary link has 2 columns EMP_ID ,EMP_NAME and Reference link has 2 columns EMP_ID and EMP_NAME.key to lookup is EMP_ID and EMP_NAME from reference is mapped to output.Though reference.EMP_NAME is mapped ,the data in that would be the data from primary link,is that true?[/quote]

Posted: Thu Sep 16, 2010 4:18 am
by PhilHibbs
Yes, because if the lookup fails (and you have Continue selected in the input link's Lookup Failure option) then you will get the stream's value and not the NULL of a failed lookup. You can map the lookup link's value to another field if you want, so that you can see which ones failed the lookup further downstream.

Posted: Thu Sep 16, 2010 4:28 am
by srinivas.nettalam
PhilHibbs,
Thank you very much.

Posted: Thu Sep 16, 2010 6:02 am
by srinivas.nettalam
Does it happen only when "continue" is selected?what data would be present if any of the other options is selected?

Posted: Thu Sep 16, 2010 6:29 am
by jpraveen
Hi Srinivas

Continue is like LEFT JOIN,
Drop is like INNER JOIN,
Reject- the record goes to Reject link,
Fail-we can make Lookup to fail when the condition not met and tha job grt Aborted.

Posted: Thu Sep 16, 2010 6:30 am
by kwwilliams
In your case continue would not be recommended. You should chose reject which would allow you to separate out your matched and unmatched records from the lookup.

You can then handle your rejects (set the empid and name to null or some other value) and then funnel the results together. In general this type of construct gives more visibility into what is occurring in the job. Many people prefer to continue and then handle the lookup failure in a downstream transformer. I prefer to lookup and reject, because it give more verbose information to whomever would be supporting the ETL I write.

Posted: Thu Sep 16, 2010 7:25 am
by srinivas.nettalam
Please pardon me for not being clear with the question from the beginning.
My question is only regarding the functionality of lookup stage only and not regarding any requirement.

If the primary and reference have same column name as per my previous question ,and reference.EMP_NAME is mapped to ouput,the data that flows to output is from stream.EMP_NAME though stream.EMP_NAME is not mapped to any of the output columns

Posted: Thu Sep 16, 2010 9:20 am
by kwwilliams
Interview question??

If you use continue and mapped the reference as the output, the result would be null if the lookup failed and you would have to handle it downstream.

Reject would only allow matches out the primary stream while creating a reject stream.

Fail would cause the entire job to fail.

Posted: Thu Sep 16, 2010 4:19 pm
by kumar_s
Yes, Datastage will drop the columns from the Lookup internally and will Keep the columns from Main stream to the output.