
Aggregator group outputs empty string value

Posted: Thu Jul 22, 2010 7:35 pm
by fphelm
Hi all.

I don't think I understand the aggregator stage because some of the output records contain empty strings when there are no empty strings in the source input.

Here is the input data:


ID      CategoryCode ProgramCode ProgramS Year Code SpecNo
1571511 degr         ba          162233   1953 W    701
1571511 degr         ba          162233   1954 W    701
1571511 degr         ba          162233   1953 W    701
1571511 degr         ba          162233   1954 W    701

This is the output of the Aggregator stage:


ID      CategoryCode ProgramCode ProgramS Year Code SpecCount
1571511 ""           ba          162233   ""   ""   2
1571511 ""           ba          162233   ""   ""   2

This is the data set that I expect:


ID      CategoryCode ProgramCode ProgramS Year Code SpecCount
1571511 degr         ba          162233   1953 W    2
1571511 degr         ba          162233   1954 W    2
The number of processing nodes is 2. The columns that I set inside the stage to group on are ID, CategoryCode, ProgramCode, ProgramS, Year, and Code. The aggregation type is set to Count Rows, and the Count Output Column is set to SpecCount. Under the Options folder I have tried both Method = Sort and Method = Hash and gotten different results. I have set up a Sort stage ahead of the Aggregator stage with an ascending sort order, where the sort columns are in the same order as the grouping columns. I hope that I am not missing any other important information.
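
To make the expected result concrete, here is a minimal sketch in plain Python of the grouping and row count described above. This is only an illustration of the logic, not DataStage code; the name count_rows and the tuple layout are assumptions made for the example.

```python
from collections import Counter

# Input rows from the post:
# (ID, CategoryCode, ProgramCode, ProgramS, Year, Code, SpecNo)
rows = [
    ("1571511", "degr", "ba", "162233", "1953", "W", "701"),
    ("1571511", "degr", "ba", "162233", "1954", "W", "701"),
    ("1571511", "degr", "ba", "162233", "1953", "W", "701"),
    ("1571511", "degr", "ba", "162233", "1954", "W", "701"),
]


def count_rows(rows, n_keys=6):
    """Count rows per group, using the first n_keys columns as the group key.

    Hypothetical helper: it mimics grouping on ID, CategoryCode, ProgramCode,
    ProgramS, Year, and Code with a row count, nothing more.
    """
    return Counter(row[:n_keys] for row in rows)


spec_counts = count_rows(rows)

# Print one line per group: the six key columns followed by SpecCount.
for key, spec_count in sorted(spec_counts.items()):
    print(*key, spec_count)
```

The grouping columns pass through unchanged as the key of each group, which is why I expect CategoryCode, Year, and Code to keep their input values in the output.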

My question is: why does the output dataset contain empty string values when no empty values exist in the input dataset?

My version of DataStage is 8.1.0.0 in case I'm missing any patches. No hotfixes or APARs have been applied.

Thank you in advance for any ideas that you have that I should look at.

Fred Helm
Newbie

Posted: Thu Jul 22, 2010 7:51 pm
by ray.wurlod
Welcome aboard.

Please tell us how your Aggregator stage properties are set up, and how the input data are partitioned and sorted.