Slow Dynamic Hash File

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Slow Dynamic Hash File

Post by asorrell »

OK - We have a moderately simple job that reads a sequential file and builds a dynamic hashed file from it. The file has only one column (a big integer) and about 12 million records. On the old DS 7.5/Windows system the job ran at about 18,000 rows per second (rps). On the new DS 8.1/Red Hat Linux system it starts off fast but is down to 500 rps by the end.

I have transferred uvconfig settings (MFILES, MAXRLOCK, etc.) from the old system and verified them.

One symptom: if I set a minimum modulus on the file of approximately 101,000 (the modulus the file ends up with at completion), the job starts at 500 rps and stays there - in other words, even worse than without the minimum modulus increased.
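For the record, the minimum modulus can be set in the Hashed File stage's create-file options, or from the engine shell - something like the sketch below (the file name is illustrative; check the CREATE.FILE syntax against your engine release):

    cd /path/to/Project        # the project directory is an engine account
    $DSHOME/bin/uvsh           # DataStage engine shell
    >CREATE.FILE KEYFILE DYNAMIC MINIMUM.MODULUS 101000
    >ANALYZE.FILE KEYFILE      # confirm the modulus actually applied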

I've been on the PX side for too long and I want to make sure I'm not forgetting anything.

Any suggestions as to what to look at / try to improve performance?
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you using write cache?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Look at the I/O on the filesystem holding the hashed file and make sure it's not running at 100%. Install top and watch the CPU utilization (you could check CPU usage in the DS Monitor, but that's unreliable). See if the process writing to the hashed file is maxing out the CPU; if not, look at what the I/O is doing.
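Something along these lines (standard Linux tools; iostat is in the sysstat package, and which device to watch depends on where the hashed file lives):

    top              # is the job's phantom process pegging a CPU?
    iostat -dx 5     # extended device stats - watch %util and await
    vmstat 5         # a high "wa" column means the CPU is stalled waiting on I/O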

Write-delay caching lets the writes queue up and commit in large chunks, so you could be seeing a massive write delay: the final flush to disk could be the bottleneck.

There's really nothing to do in your situation except set the minimum modulus and leave the rest of the settings on the hashed file alone.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Well, there's always the static hashed file approach, but I agree with Ken's assessment. Maintaining static hashed files is more trouble than it's worth (this includes getting the initial size right).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

More Info!

What is really strange is that the job was running a LOT faster (about 28K rps) on the smaller Dev box than on the larger Production box. The Dev box uses local disk whereas the Production box uses SAN disk, but I ran a test and the SAN filesystem didn't appear to be affecting performance.
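(For what it's worth, a quick way to check raw throughput on a mount is something like the sketch below - path illustrative. Note that a streaming test like this measures sequential writes, while hashed file I/O is small and random, so it can look fine while the real problem hides.)

    dd if=/dev/zero of=/sanmount/ddtest bs=1M count=2048 oflag=direct   # bypass the page cache
    rm /sanmount/ddtest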

I was thinking in the same vein as both of you - wondering whether the I/O cache was affecting it. I figured it was safe to turn on the Stage Write Cache since the job is just loading the file.

When I ran the job, the "load" finished at around 90K rps. Of course, it then took several minutes for UNIX to finish writing the cache to disk, so the end-to-end rate averaged 35K rps. That doesn't fix the underlying problem, but it compensates enough to make the job runnable in the allotted timeframe.

In playing around with the following settings I saw no appreciable difference:
- SEQ.NUM vs. GENERAL hash (the key is numeric, mostly sequential)
- Clearing file vs. Delete / Create
- Local disk vs. SAN disk

So at this point, I'm waiting for the IT support team to get in at 7:00 so we can monitor production to see if we can spot the issue. My initial guess would be something different in O/S configuration.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Keep in mind also that server jobs in version 8 operate in a "cocoon" within the parallel execution environment. Anything that has to use the services, such as metadata checking, has to be translated into the strongly typed form and back again. That necessarily introduces some overhead - not of the magnitude you reported, but it would be a part of it. Ditto if you are generating operational metadata.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I did not know that! However, we aren't generating operational metadata at this time. Not to derail the thread - but any idea how much overhead that might add in a typical environment?
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Some.

The rest of the answer is tied up in the "how long is a piece of string" nature of any "impact on performance" question - what's the job doing, what else is happening, how busy is the metadata delivery service, how busy is the network if XMETA is on a different machine, and so on.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You might find the resolution to this post interesting.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I thought I'd post the solution to this one... Tier 2 tech support suggested creating a RAM disk and putting the dynamic hashed file on it to isolate it from the filesystem. When we did that, it ran extremely well. This enabled us to prove to the NAS / SAN guys that it was NOT DataStage that had the problem.

To make a long story short, it turned out to be heavy traffic on the network that was supporting the production SAN. The high I/O wait times were resolved by moving to storage that was on a filer with less network traffic.
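For anyone who wants to repeat the isolation test: a tmpfs mount is one way to get a RAM disk on Linux (a sketch - size and mount point are illustrative, root is required, and tmpfs contents vanish at reboot, so this is for diagnosis only):

    mkdir -p /mnt/ramdisk
    mount -t tmpfs -o size=2g tmpfs /mnt/ramdisk
    # then point the job's hashed file directory path at /mnt/ramdisk and rerun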
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Nice one. I've seen that done on Windows and Linux using a flash memory stick. Even 16GB is reasonably cheap these days. Arnd has used one for scratch space in a parallel job!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.