Aggregator Performance

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

raju4u
Participant
Posts: 56
Joined: Thu Dec 13, 2007 12:30 am

Aggregator Performance

Post by raju4u »

Hi,

In our job we pass 19 crore (190 million) rows to an Aggregator stage, and it takes 3 hours. The input is sorted and hash partitioned, and the Aggregator's method is set to Sort. Please let me know if there is any other way to reduce the run time.

Thanks,
Rajashekar.
N R REDDY
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Re: Aggregator Performance

Post by SURA »

What is the data volume, and how many columns are used in the aggregation?

Find out where most of the time is being spent.

Splitting the job may help to reduce the time.

DS User
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Please advise what the grouping columns for aggregation are.

Essentially, though, you need to partition on only the first of these (unless it has very few distinct values) and sort on all of them, in order, to be able to use Sort as the aggregation method.
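
To illustrate the idea outside DataStage, here is a minimal Python sketch of partitioning on the first grouping key only and then sorting each partition on all grouping keys before a streaming aggregation. It is not DataStage code: the sample rows, the column layout (region, product, amount) and the function names are all hypothetical, chosen only to show why every row of a group ends up in the same partition.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (region, product, amount).
# Grouping keys are (region, product); amount is summed.
ROWS = [
    ("east", "a", 10),
    ("west", "b", 5),
    ("east", "b", 7),
    ("east", "a", 3),
    ("west", "b", 1),
    ("north", "a", 4),
]


def hash_partition(rows, num_partitions):
    """Assign each row to a partition using only the FIRST grouping key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[hash(row[0]) % num_partitions].append(row)
    return partitions


def sort_aggregate(partition):
    """Sort on ALL grouping keys, then sum each group in one pass."""
    key = itemgetter(0, 1)
    ordered = sorted(partition, key=key)
    # groupby only merges adjacent rows, which is why the sort is required.
    return {k: sum(r[2] for r in grp) for k, grp in groupby(ordered, key=key)}


def aggregate(rows, num_partitions=2):
    """Aggregate each partition independently and combine the results."""
    totals = {}
    for part in hash_partition(rows, num_partitions).values():
        # A group never spans partitions, so the keys cannot collide here.
        totals.update(sort_aggregate(part))
    return totals


if __name__ == "__main__":
    print(aggregate(ROWS))
```

Because rows that share the first key always land in the same partition, rows that share all the keys do too, so the per-partition results can simply be combined. The caveat about few distinct values shows up here as well: with only three regions, at most three partitions would ever receive data.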
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

Did you try increasing the number of nodes?
kommven
Charter Member
Posts: 125
Joined: Mon Jul 12, 2004 12:37 pm

Post by kommven »

Compare a job that does only a simple select against the job that includes the Aggregator.
I assume the throughput from the Source stage is a useful measure of your job's overall performance.

I would also suggest landing the data in a Data Set and using that as the source, then comparing the results to see whether there is any opportunity for improvement.