Warning in Aggregator stage (Hash table has grown, etc.)

Posted: Tue Jan 04, 2011 2:10 pm
by Marley777
Hi, thanks for reading. I'm getting the following warning:

agg_UWMargin,0: Hash table has grown to 16384 entries. [groupby/hashgroup2.C:966]

So I changed the method to 'sort' instead of 'hash' and the warning went away. However, I'm worried that when running with a 4-node config file and method=sort, the data will not be partitioned correctly. I thought that was what method=hash would do. Do I need to do anything extra in the Aggregator stage to make sure the data is sorted and partitioned (spread across the nodes) correctly?

Here is how my environment variables are set:
APT_NO_SORT_INSERTION = FALSE
APT_NO_PART_INSERTION = FALSE

Posted: Tue Jan 04, 2011 3:11 pm
by ray.wurlod
The data have to be partitioned on the first grouping key and sorted on all the grouping keys when sort method is used. But your environment variables prevent DataStage from inserting tsort and partitioner operators to achieve this, so YOU must do the explicit sorting and partitioning.
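
To illustrate why the sort method needs sorted input, here is a minimal Python sketch (not DataStage code) of a sort-style aggregation that emits a total each time the grouping key changes. The function name `sort_method_aggregate` and the field names `region`, `product`, and `amount` are made up for this example.

```python
from itertools import groupby
from operator import itemgetter


def sort_method_aggregate(rows, key_fields, value_field):
    """Sum value_field per group, assuming rows arrive sorted on key_fields.

    Only one group's running total is held at a time, so no hash table
    of all groups is needed. The price is that the input must be sorted.
    """
    key = itemgetter(*key_fields)
    # groupby only merges ADJACENT rows that share the same key.
    for group_key, group_rows in groupby(rows, key=key):
        yield group_key, sum(r[value_field] for r in group_rows)


# Hypothetical sample rows, deliberately not sorted on the grouping keys.
rows = [
    {"region": "E", "product": "A", "amount": 10},
    {"region": "W", "product": "A", "amount": 5},
    {"region": "E", "product": "A", "amount": 7},
]
keys = ("region", "product")

# Unsorted input: the ("E", "A") group is split into two partial totals.
unsorted_result = list(sort_method_aggregate(rows, keys, "amount"))

# Sorted on all grouping keys: each group comes out exactly once.
sorted_rows = sorted(rows, key=itemgetter(*keys))
sorted_result = list(sort_method_aggregate(sorted_rows, keys, "amount"))
```

With the unsorted input the same key appears twice in the output, which is the kind of wrong result that sorting on all the grouping keys avoids.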

Posted: Tue Jan 04, 2011 3:32 pm
by Marley777
Ray,

1- I can do this on the Partitioning tab, correct?

2- I have 4 key fields. Can I use 'usage=sorting, partitioning' on all 4 fields?

I want to sum totals for a 4-field key grouping. I'm thinking the rows for each group will all have to be on the same partition, and that usage 'sorting, partitioning' will take care of this for me?

Posted: Tue Jan 04, 2011 5:01 pm
by ray.wurlod
Only partition on the first of the fields, provided that this generates enough distinct values to cover all your processing nodes. Sort on all four.
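
As a rough check of the advice above, the following Python sketch (again, not DataStage code) hash-partitions rows on the first key only, sorts each partition on all four keys, and aggregates per partition. The key names `k1` to `k4`, the `amount` field, the CRC32-based partitioner, and the sample data are all assumptions made for illustration; DataStage's own hash partitioner is not reproduced here.

```python
import zlib
from collections import defaultdict
from itertools import groupby

KEYS = ("k1", "k2", "k3", "k4")  # hypothetical grouping keys


def group_key(row):
    return tuple(row[k] for k in KEYS)


def partition_of(row, num_nodes):
    # Hash ONLY the first key. Rows sharing k1 land on the same node,
    # so rows sharing all four keys do too.
    return zlib.crc32(str(row["k1"]).encode()) % num_nodes


def aggregate_partitioned(rows, num_nodes=4):
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        partitions[partition_of(row, num_nodes)].append(row)

    results = {}
    for part in partitions:
        part.sort(key=group_key)  # sort on all four grouping keys
        for key, grp in groupby(part, key=group_key):
            # A group must never show up on two partitions.
            assert key not in results
            results[key] = sum(r["amount"] for r in grp)
    return results


def aggregate_global(rows):
    # Reference answer: one hash table over the whole data set.
    totals = defaultdict(int)
    for r in rows:
        totals[group_key(r)] += r["amount"]
    return dict(totals)


# Hypothetical sample data: 60 rows over a handful of key combinations.
rows = [
    {"k1": i % 5, "k2": i % 3, "k3": i % 2, "k4": "x", "amount": i}
    for i in range(60)
]

partitioned_totals = aggregate_partitioned(rows)
global_totals = aggregate_global(rows)
```

The per-partition totals match the single-pass totals because every four-key group lives on exactly one partition. In this sample `k1` has only five distinct values for four nodes, so the rows may be spread unevenly, which is the caveat about needing enough distinct values in the first key.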