Hash File Tuning



Kalyan3699
Participant
Posts: 2
Joined: Fri Dec 09, 2005 3:12 pm


Post by Kalyan3699 »

I am trying to read a 2 million row table (60 columns, of which 4 are keys) into a hash file. It has been taking around 2 hours to load the data with the default options provided for the hash file. I tried to optimize the hash file using HFC.exe (the Hash File Calculator tool); it suggested either a file type of 14 or 18, or a modulo of 367097, but the performance is no different. Could someone suggest options for tuning the hash file?

Thanks
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Pre-sizing the modulo on a dynamic file, and/or using a static hashed file of appropriate size, works well. Also, do you have row buffering enabled? That will increase your speed. Are you sure that the hashed file write is the bottleneck? If you do a short test and change your hashed file stage into a sequential file writing to /dev/null, do you get a much better speed?
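As a rough sketch of what pre-sizing looks like (the file names here are placeholders, and the modulo/separation values are just the ones HFC suggested in this thread), you could pre-create the hashed file from the TCL/Command prompt in the project account before the job runs:

```
* Hypothetical file names; run from the DataStage (UniVerse) TCL prompt.
* Static hashed file, type 18, modulo 367097 (separation 4 as an example --
* HFC will suggest an appropriate separation for your row size):
CREATE.FILE MyLookupHF 18 367097 4

* Or a dynamic (type 30) file pre-sized so it does not have to split
* groups repeatedly while the 2 million rows are loaded:
CREATE.FILE MyLookupHF2 DYNAMIC MINIMUM.MODULUS 367097
```

The point of either form is the same: the file starts out large enough for the full data volume, so the load does not pay the cost of incremental group splitting as rows arrive.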
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You might also try using Write Cache, having allowed the maximum possible cache size (999MB).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Kalyan3699
Participant
Posts: 2
Joined: Fri Dec 09, 2005 3:12 pm

Post by Kalyan3699 »

Ray,
I am writing to and reading from the same hash file in the job, which is why I can't select the write cache option.

Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ok... that would have been a nice fact to mention up front. Next question, why are you doing that? You are obviously doing a wee bit more than simply reading 2 million rows into a hashed file...
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

With a hashed file that size, do the initial population in a separate job, WITH write cache enabled. Be amazed!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.