How can I pass through columns in Aggregator?

qutesanju · Post by **qutesanju** » Wed Jan 09, 2013 4:55 pm

Let's say 10 columns arrive on the input stream, I'm grouping on only 2 of them, and one more column is a SUM.
How do I map the remaining 7 columns from input to output, or through the Aggregator?

The Aggregator will not allow the remaining columns to be mapped directly from input to output, because each of them needs some aggregate function.

The job flow looks like this:

input stream --> aggregator                 --> output stream
(10 col)     --> (2 group by +1 sum column) -->  (how to map remaining 7 columns?)

chulett · Post by **chulett** » Wed Jan 09, 2013 5:00 pm

As the number of rows out is less than the number in, you can't simply map ungrouped columns across. Typically one would use MIN() or MAX() for those columns so that only one value is passed per group.
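To make the MIN()/MAX() idea concrete, here is a minimal sketch in plain Python rather than DataStage. The column names (`region`, `product`, `amount`, `name`, `qty`) and the `aggregate` helper are hypothetical and only stand in for the 2 group-by columns, the SUM column, and the "other" columns.

```python
from collections import defaultdict

# Hypothetical input rows: two grouping keys, one column to sum, two "other" columns.
rows = [
    {"region": "EU", "product": "A", "amount": 10, "name": "alpha", "qty": 9},
    {"region": "EU", "product": "A", "amount": 5, "name": "beta", "qty": 7},
    {"region": "US", "product": "B", "amount": 3, "name": "gamma", "qty": 2},
]


def aggregate(rows, keys, sum_col):
    """Group by `keys`, SUM `sum_col`, and take MAX of every other column."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)

    out = []
    for key, members in groups.items():
        result = dict(zip(keys, key))
        result[sum_col] = sum(m[sum_col] for m in members)
        other_cols = [c for c in members[0] if c not in keys and c != sum_col]
        for col in other_cols:
            # MAX is computed per column, so values may come from different input rows.
            result[col] = max(m[col] for m in members)
        out.append(result)
    return out


result = aggregate(rows, keys=["region", "product"], sum_col="amount")
for row in result:
    print(row)
```

Note that in the EU/A group the maximum `name` and the maximum `qty` come from different input rows, so the output row is not a copy of any single input row.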

ray.wurlod · Post by **ray.wurlod** » Wed Jan 09, 2013 5:28 pm

For each "other" column, which value from all the records in the group do you want associated with the grouped value?

Maybe you could look at a "fork join" job design.

But think, how would you achieve the results in SQL? You can't. For exactly the same reasons you cannot achieve the results using an Aggregator stage.

prasson_ibm · Post by **prasson_ibm** » Thu Jan 10, 2013 8:25 am

Hi,

You can design your job in this way:

input ---> copy ---> Aggregator -------> left outer Join ---> output
            |                                  ^
            +------> pass 10 cols ------------+

Thanks
Prasoon
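The copy / Aggregator / join design above can be sketched in plain Python as follows. This is only an illustration under assumptions: the column names, the `fork_join` helper, and the `group_total` output column are hypothetical, and the pass-through link is assumed to be the left input of the join.

```python
from collections import defaultdict

# Hypothetical input rows: two grouping keys, one column to sum, two "other" columns.
rows = [
    {"region": "EU", "product": "A", "amount": 10, "name": "alpha", "qty": 9},
    {"region": "EU", "product": "A", "amount": 5, "name": "beta", "qty": 7},
    {"region": "US", "product": "B", "amount": 3, "name": "gamma", "qty": 2},
]


def fork_join(rows, keys, sum_col, out_col):
    """Aggregate on one branch, then join the totals back onto every input row."""
    # Branch 1 (the Aggregator link): one total per group key.
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row[sum_col]

    # Branch 2 (the pass-through link) is the left side of a left outer join,
    # so every input row survives with all of its original columns.
    joined = []
    for row in rows:
        key = tuple(row[k] for k in keys)
        joined.append({**row, out_col: totals.get(key)})
    return joined


joined = fork_join(rows, keys=["region", "product"], sum_col="amount", out_col="group_total")
for row in joined:
    print(row)
```

Unlike the MIN()/MAX() approach, this keeps one output row per input row, with the group total repeated on each row of the group.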

qutesanju · Post by **qutesanju** » Thu Jan 10, 2013 9:26 am

I used the max function for the rest of the columns, and the data was populated as null.

Will the max or min function work for varchar data?

chulett · Post by **chulett** » Thu Jan 10, 2013 9:57 am

Not by default. Default output is a double which you can make a decimal. For string fields you need to find and enable the 'Preserve type' property.
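As a loose analogy only (plain Python, not how the Aggregator stage is implemented), the difference between a MAX forced to a double and a MAX that keeps the input type might look like this. Both helper functions are hypothetical, and returning `None` for strings that cannot be converted is an assumption standing in for the nulls reported above.

```python
def max_as_double(values):
    """Mimic a MAX whose output is forced to a double."""
    converted = []
    for value in values:
        try:
            converted.append(float(value))
        except (TypeError, ValueError):
            # Assumption: a string that is not a number surfaces as null.
            return None
    return max(converted) if converted else None


def max_preserving_type(values):
    """Mimic a MAX that keeps the input type: strings compare lexicographically."""
    return max(values) if values else None


names = ["alpha", "beta", "gamma"]
amounts = [10, 5, 3]

print(max_as_double(names))        # None
print(max_preserving_type(names))  # gamma
print(max_as_double(amounts))      # 10.0
```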

qutesanju · Post by **qutesanju** » Thu Jan 10, 2013 10:41 am

Perfect, chulett, thanks for the solution,
but I'm unable to find where to set and enable 'Preserve type'.
I can see only the decimal property for the SUM column.

When I used max(name), or max of any varchar column, the output column was by default populated as dfloat or double, so I manually set its data type to varchar.

But apart from that, where should I set and enable 'Preserve type'?

chulett · Post by **chulett** » Thu Jan 10, 2013 11:15 am

I have no DataStage access and my documentation is at home which I am not, so I'm going to have to defer to the wisdom of others to answer that question at this very moment.

ArndW · Post by **ArndW** » Thu Jan 10, 2013 11:29 am

In the drop-down list "available properties to add" where you select the "max" value, you can (and should) also choose "preserve type".

qutesanju · Post by **qutesanju** » Thu Jan 10, 2013 11:36 am

Got it

Stage --> Properties --> Aggregations --> Aggregation Type: Calculation --> click on the column for calculation --> look in "available properties to add".

qutesanju · Post by **qutesanju** » Thu Jan 10, 2013 11:51 am

Thanks all for your inputs. So the solution is: if you want to keep varchar columns in your group-by flow, use the min and max functions, and set preserve type=TRUE only for varchar columns
and preserve type=FALSE only for numeric columns.

This way all columns get populated to the target.

Thanks for the best DS forum.
