Hi
I have a job which is reading from oracle source table and looking up with a another oracle table. the data in the reference table is extremely large as compared to source.
I want to know which option wud be more efficient ,a sparselookup or a merge stage.
sparse Lookup v/s Merge
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
It depends on how large your source data is. If it has hundreds of thousands of rows then a sparse lookup is going to hit your reference database with hundreds of thousands of queries. A merge stage is going to pull millions of rows out of the Oracle database. A third option is to bulk load the input data into Oracle and do a join between the two tables - this may be the fastest with the smallest impact on the database.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
-
- Premium Member
- Posts: 62
- Joined: Thu Dec 28, 2006 11:54 pm
actually i want to have a reject file just in case there are no matching records in lookup(normally there should be no rejects )
I tried sparse lookup but it takes very long and gives me undesired number of rows in output.
I am using the merge stage as of now. Do I need to sort the data before using merge? Also, I read that there should be no duplicates for merge to take place correctly. However,my data is such that the column based on which I am merging can have the same value in hundreds of rows. Will this cause a problem?
I tried sparse lookup but it takes very long and gives me undesired number of rows in output.
I am using the merge stage as of now. Do I need to sort the data before using merge? Also, I read that there should be no duplicates for merge to take place correctly. However,my data is such that the column based on which I am merging can have the same value in hundreds of rows. Will this cause a problem?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Merge stage requires de-duplicated data in the "update" input(s), and in the "master" input if there are more than one update inputs. Merge stage requires that all inputs be sorted on the Key column(s). Getting the right results also requires that all inputs be partitioned on the Key column(s).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.