How can I pass through columns in Aggregator?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
qutesanju
Participant
Posts: 373
Joined: Tue Aug 26, 2008 4:52 am

How can I pass through columns in Aggregator?

Post by qutesanju »

Let us say from input stream there are 10 columns coming and I'm grouping based only on 2 columns and assume plus one column as SUM,
how to map remaining 7 columns from input to output? Or through the aggregator?

As aggregator will not allow remaining columns directly mapped from input to output because they should have some group by function


Job flow like

Code: Select all

input stream --> aggregator                 --> output stream
(10 col)     --> (2 group by +1 sum column) -->  (how to map remaining 7 columns?)
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As the number of rows out is less than the number in you can't simply map ungrouped columns across. Typically one would use MIN() or MAX() for those columns so that only one value is passed per group.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

For each "other" column which value from all the records in the group do you want associated with the grouped value?

Maybe you could look at a "fork join" job design.

But think, how would you achieve the results in SQL? You can't. For exactly the same reasons you can not achieve the results using an Aggregator stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
prasson_ibm
Premium Member
Premium Member
Posts: 536
Joined: Thu Oct 11, 2007 1:48 am
Location: Bangalore

Re: Aggregator

Post by prasson_ibm »

Hi,

You can design your job in this way:

Code: Select all

input---> copy---->Aggregator     ----->left outer Join ----->output
               ---> pass 10 cols
  


Thanks
Prasoon
qutesanju
Participant
Posts: 373
Joined: Tue Aug 26, 2008 4:52 am

Post by qutesanju »

I used max function for rest of other columns and data populated as null

will max or min function work for varchar type of data?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not by default. Default output is a double which you can make a decimal. For string fields you need to find and enable the 'Preserve type' property.
-craig

"You can never have too many knives" -- Logan Nine Fingers
qutesanju
Participant
Posts: 373
Joined: Tue Aug 26, 2008 4:52 am

Post by qutesanju »

perfect chulett ,thanks for solution,
but i m unable to find where should i set and enable the 'Preserve type' ?
i can see only decimal property for SUM column?

when i used max(name) or max of any varchar data type ,it be default populated dfloat or double in column filed i manuaaly set its datatype to varchar

but apart from that where should i set and enable the 'Preserve type' ?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I have no DataStage access and my documentation is at home which I am not, so I'm going to have to defer to the wisdom of others to answer that question at this very moment.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

In the drop-down list "available properties to add" where you select the "max" value, you can (and should) also choose "preserve type"
qutesanju
Participant
Posts: 373
Joined: Tue Aug 26, 2008 4:52 am

Post by qutesanju »

Got it

Stage --> properties --> Agrregations ---> aggregation type calculation --> click on column for calculation --> see in available properties to add .
qutesanju
Participant
Posts: 373
Joined: Tue Aug 26, 2008 4:52 am

Post by qutesanju »

Thanks all for your inputs ,so the solution is if you want to keep varchar data type in your group by flow use min amd max functions and use preserve type=TRUE only for varchar data type
and preserve type=FALSE only for numeric data type

this way all columns get populated to taget

Thanks for best DS forum
Post Reply