Comparision of Hash and OCI lookup

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
NBALA
Participant
Posts: 48
Joined: Tue Jul 11, 2006 11:52 am
Location: IL, USA

Comparision of Hash and OCI lookup

Post by NBALA »

Hi,

Hope this might be discussed earlier, when searching I could not get right answer.

Situation : Consider a fact table contains 7 dimension and need to lookup all 7 dimensions.

While comparing hask lookup and OCI lookup which one will be best choice? and which will give good performance?

Which will more suitable to implement SCD?

If we used hash table, what are the issues can one expect?

Thanks
-NB
satish_valavala
Participant
Posts: 123
Joined: Wed May 18, 2005 7:41 am
Location: USA

Post by satish_valavala »

Hi Bala,

Do you mean lookup dimension table(s) using OCI stage(s)?
What is the volume of your dimension table(s)?

Best practice in performing lookup is, using Hashed Files. By using Hashed Files, you always write lookup data into Hashed Files, where as with OCI stages ( OCI SQL / user-defined SQL) you will directly access physical DB tables.

Best Regards
VS
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Either works. It is a matter of preferrence. It is also a performance issue. Hashed files are faster but they take some time to build. If you add up the overall times for both then that should be your deciding factor. The fewer rows in the dimension the faster it loads. So I would say until you get to over 100,000 rows hashed files are faster then you need to check your performance of both.
Mamu Kim
deepak.shanthamurthy
Participant
Posts: 17
Joined: Tue Apr 11, 2006 2:35 pm

Post by deepak.shanthamurthy »

you might also want to consider the update frequency of your dimensions as a factor
Deepak
NBALA
Participant
Posts: 48
Joined: Tue Jul 11, 2006 11:52 am
Location: IL, USA

Post by NBALA »

Thanks all !

The dimension has around 100,000 rows(in one of the Dim), I tried to use OCI but getting poor performance like 8 rows/sec. It started with 121 rows/sec then it started gradually decreasing the number of rows.. fact is having 1.3 million to load.

I will try with hash and check the performance.

-NB
lcallif2
Participant
Posts: 4
Joined: Fri Sep 22, 2006 4:05 pm
Location: MST
Contact:

Re: Comparision of Hash and OCI lookup

Post by lcallif2 »

You could have this done in the source stage, outerjoining the 7 dimension tables. Thsi will allow the Database to perform all the lookup work, which it can perform as LOT better then the Job via a hash or OCI lookup....

Thanks,

Lester


Hope this might be discussed earlier, when searching I could not get right answer.

Situation : Consider a fact table contains 7 dimension and need to lookup all 7 dimensions.

While comparing hask lookup and OCI lookup which one will be best choice? and which will give good performance?

Which will more suitable to implement SCD?

If we used hash table, what are the issues can one expect?

Thanks
-NB[/quote]
Lester
Post Reply