I did a search on performance monitoring and still can't seem to find what I am looking for.
We have a Windows server with 32 GB of RAM, a quad-core CPU, and 600 GB of disk space.
During our process, we sometimes receive a resource allocation error that aborts the process.
What kinds of monitoring could I ask the network people for (e.g., memory, CPU)? Are there specific areas they could look at to help determine some of our issues?
I know very little about the network/infrastructure side and would like to ask the proper questions with regard to DataStage 8.1.
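Concretely, one thing to ask the Windows admins for is a Performance Monitor counter log captured while your process runs, so an abort can be lined up against resource usage at that moment. The counters below are standard Windows Performance Monitor counters; the `dsapi_slave` process-name filter is an assumption about how the DataStage engine processes appear on your server, so adjust it to whatever you see in Task Manager during a run:

```shell
# Create a counter log sampled every 15 seconds (run as Administrator),
# then start it before the DataStage process and stop it afterwards.
logman create counter DSPerf -si 15 -o C:\PerfLogs\dsperf -c ^
  "\Memory\Available MBytes" ^
  "\Memory\Pages/sec" ^
  "\Processor(_Total)\% Processor Time" ^
  "\PhysicalDisk(_Total)\Avg. Disk Queue Length" ^
  "\Paging File(_Total)\% Usage" ^
  "\Process(dsapi_slave*)\Private Bytes"

logman start DSPerf
```

Low "Available MBytes" or a climbing "Paging File % Usage" at the time of the abort would point at memory pressure rather than CPU or disk.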
Thanks
Performance Monitoring
Jim Stewart
We have a similarly sized Windows server (dual quad core, actually) and receive intermittent resource errors that are extremely difficult to reproduce/replicate.
IBM support points to job design every time, but I feel that's the easiest answer for them to give.
We run multiple invocations at a time (no more than two or three simultaneously), each using the 2-node configuration file. Still, we have had jobs abort, mostly when "larger design" jobs from separate invocations happen to run at the same time. Most of the time the sequences' processing is staggered, so things work fine: a larger job from one invocation may run alongside a smaller job from another, and so on.
From our experience, it seems to be related to job size rather than to the number of rows being processed; I once had a job fail that was processing fewer than 100 rows. So perhaps it's the overhead of starting up the job and its related processes? There is plenty of available memory and disk, but CPU can occasionally spike very high for short intervals (10-20 seconds). However, I feel that this is normal in server computing and shouldn't "break" the system.
All we can do is stay extremely aware of job design: the number of stages in a job (and the number of system processes created at runtime), and the use of Sort/Remove Duplicates/Aggregator stages, which we feel are more resource-intensive than most.
We're continuing to collect information on the occurrences as they happen. The hardest part is that the abort gives us no indication of WHAT resource is lacking (hardware vs. logical/internal to DataStage).
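Since the aborts seem to coincide with "larger" jobs from separate invocations overlapping, one low-tech way to collect evidence is to export job start/end times from the Director log and flag the windows where more than N jobs ran at once, then check whether the aborts fall inside those windows. A minimal sketch (the run data, threshold, and export step are hypothetical; you would pull the timestamps from your own logs):

```python
def concurrent_jobs(runs, threshold=2):
    """Return intervals during which at least `threshold` jobs ran at once.

    `runs` is a list of (job_name, start, end) tuples whose timestamps
    you would export from the DataStage Director log (any comparable
    type works: datetimes, epoch seconds, etc.)."""
    # Turn each run into a +1 event at start and a -1 event at end,
    # then sweep through the events in time order. Ends sort before
    # starts at the same instant, so back-to-back runs don't overlap.
    events = []
    for _name, start, end in runs:
        events.append((start, 1))
        events.append((end, -1))
    events.sort()

    active = 0            # jobs currently running
    overlap_start = None  # when the current over-threshold window began
    intervals = []
    for ts, delta in events:
        active += delta
        if active >= threshold and overlap_start is None:
            overlap_start = ts
        elif active < threshold and overlap_start is not None:
            intervals.append((overlap_start, ts))
            overlap_start = None
    return intervals
```

If the aborts consistently land inside the returned intervals, that's a stronger case to take back to IBM support than "job design".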
I'll keep monitoring this thread too, as I have a high interest in any information learned/shared.
Thanks,
David Wagner