aggregator performance
I am using an Aggregator stage just to count the number of rows coming from a particular link.
The design is like this:
Seq file-->transformer-->aggregator-->seq file
Here I need the Aggregator to count the total rows from the Transformer. The key is the same for all the records, so everything would pass through only one partition.
I am dealing with millions of records. We are still in development, but I wanted to know how this would affect performance. Or is there any other way to do this?
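To illustrate the concern above, here is a minimal Python sketch (not DataStage code) of why a constant key sends every row to a single partition under hash partitioning. The function names, the CRC32 hash, and the four-partition setup are all assumptions made for illustration; the engine's real partitioner will differ.

```python
import zlib
from collections import Counter

def partition_of(key: str, num_partitions: int) -> int:
    # Hypothetical hash partitioner: the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def rows_per_partition(keys, num_partitions):
    # Count how many rows each partition would receive.
    counts = Counter(partition_of(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]

# Every record carries the same key, as in the job described above.
constant_keys = ["ALL"] * 1000
load = rows_per_partition(constant_keys, 4)

# Partitions that actually received rows.
busy_partitions = [n for n in load if n > 0]
print(load, len(busy_partitions))
```

With a constant key, only one entry of `load` is non-zero, so a single partition does all the counting work.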
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 2
- Joined: Thu Jul 24, 2008 9:43 am
Re: aggregator performance
I experienced the same issue. I tried using two Aggregator stages, the first with hash partitioning and the second running sequentially, and that works really well.
Just try this.
Thanks.
Skumar
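The two-stage idea in the post above can be sketched in plain Python (this is not DataStage code). The first step produces one partial count per partition, and the second step sums those few partial counts in a single sequential pass. The helper names, the four partitions, and the round-robin spread are assumptions for illustration only; round robin is swapped in here simply so that rows land on more than one partition.

```python
def split_round_robin(rows, num_partitions):
    # Hypothetical stand-in for partitioning: deal rows out evenly.
    parts = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        parts[i % num_partitions].append(row)
    return parts

def partial_counts(partitions):
    # Stage 1: each partition counts only its own rows (parallel in a real job).
    return [len(p) for p in partitions]

def final_count(partials):
    # Stage 2: a sequential step that sums one small number per partition.
    return sum(partials)

rows = list(range(1_000_003))
partials = partial_counts(split_round_robin(rows, 4))
total = final_count(partials)
print(partials, total)
```

The sequential step only ever sees as many values as there are partitions, so it stays cheap however many rows the first step handles.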
sima and sjaladurgam
So the two Aggregator stages would not hinder performance when processing millions of records? I am just worried because the data volume is very large. Anyway, thanks for your input.
Ray, I am not sure how we can calculate the count during the actual processing, because I have to calculate it without partitioning anyway to get the total count.
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
Re: aggregator performance
sjaladurgam wrote:...and second one with sequential ...