Combine multiple rows into one row (Vertical Pivoting)

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
rohit_mca2003
Participant
Posts: 41
Joined: Wed Oct 08, 2008 9:19 am

Combine multiple rows into one row (Vertical Pivoting)

Post by rohit_mca2003 »

Hi,
I have a requirement to pivot the multiple rows into a single row. I have input data as below:

Input Data:
-------------
Id Fname Sname Section Ques Ans
--------------------------------------------------------------------
100 Sam Roy S1 Q11 A11
100 Sam Roy S1 Q12 A12
100 Sam Roy S1 Q13 A13
100 Sam Roy S2 Q21 A21
100 Sam Roy S2 Q22 A22


I like output data as follows in one row and different columns:
-----------------------------------------------------------------------
100 Sam Roy S1 Q11 A11 Q12 A12 Q13 A13 S2 Q21 A21 Q22 A22


Could you please help me to resolve this.

Regards.
tehavele
Premium Member
Premium Member
Posts: 79
Joined: Wed Nov 12, 2008 5:41 am
Location: Melbourne

Re: Combine multiple rows into one row (Vertical Pivoting)

Post by tehavele »

try using column export stage.
Tejas
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Search the forums for "vertical pivot", it has already been discussed here. Alot.
-craig

"You can never have too many knives" -- Logan Nine Fingers
rohit_mca2003
Participant
Posts: 41
Joined: Wed Oct 08, 2008 9:19 am

Combine multiple rows into one row (Vertical Pivoting)

Post by rohit_mca2003 »

Thanks for all your replies.
I have searched the topics but I have not got any suitable reply for my input set of data.

If you see the input set it has two pivot columns ID and SECTION. It would be greatful if anyone can provide solution to this example.

Input Data:
-------------
Id Fname Sname Section Ques Ans
--------------------------------------------------------------------
100 Sam Roy S1 Q11 A11
100 Sam Roy S1 Q12 A12
100 Sam Roy S1 Q13 A13
100 Sam Roy S2 Q21 A21
100 Sam Roy S2 Q22 A22


I like output data as follows in one row and different columns:
-----------------------------------------------------------------------
100 Sam Roy S1 Q11 A11 Q12 A12 Q13 A13 S2 Q21 A21 Q22 A22
Raghavendra
Participant
Posts: 147
Joined: Sat Apr 30, 2005 1:23 am
Location: Bangalore,India

Post by Raghavendra »

In this example I see you have a set of 5 records. In your data, is this set of records fixed.
Raghavendra
Dare to dream and care to achieve ...
Sathishkumarins
Premium Member
Premium Member
Posts: 41
Joined: Tue Jul 08, 2008 5:45 am
Location: Columbus

Re: Combine multiple rows into one row (Vertical Pivoting)

Post by Sathishkumarins »

"Input Data:
-------------
Id Fname Sname Section Ques Ans
--------------------------------------------------------------------
100 Sam Roy S1 Q11 A11
100 Sam Roy S1 Q12 A12
100 Sam Roy S1 Q13 A13
100 Sam Roy S2 Q21 A21
100 Sam Roy S2 Q22 A22


I like output data as follows in one row and different columns:
-----------------------------------------------------------------------
100 Sam Roy S1 Q11 A11 Q12 A12 Q13 A13 S2 Q21 A21 Q22 A22"

Hey I'm working on this as an assignment.
Will get back to you if i get a solution.
boxtoby
Premium Member
Premium Member
Posts: 138
Joined: Mon Mar 13, 2006 5:11 pm
Location: UK

Post by boxtoby »

If I understand tour requirement correctly then I have done this several times before!

I also imagine your data looks like this:

100......Q11 A11
100...............
100..............
100...............
100..............
101.......Q11 A11
101..............
101.............
101.............
101...............

Do you always have 5 records per "group"? (100, 101 etc)

Let's assume so for the moment.

What you need is a transformer with stage variables.

use an sv to count the rows as they come in. When the count = 5 then output a row.

While rows 1 - 5 are coming in use other stage vars to populate the output columns, the output columns are related to the row count.

It's a bit fidley and you'll need to check carefully what happens and when, but it works.

If the row count is not always 5 then you need to geberate the row count before the transformer:

Split the rows in to an "ID column only" and "all columns" feed using a copy stage. On the ID column only use an Aggregator to count rows per ID. (On the input to the copy stage repartition using the hash option on the ID and sort by ID to ensure the aggregation and subsequent look up works.)

Use a look up stage to join the two feeds and collect the row count, now procede as before.

Hope this helps!
Bob Oxtoby
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

Whether the set of records is fixed or not, the simplest way is using stage variables. One way of doing this is "if previous KEY does not match with current KEY, then assign input columns to stage var Else concatenate input columns to stagevar"

The you can use RemDuplicates or aggregator to get the last record. Once you have the last record, you can use INDEX and FIELD functions to populate your target fields (based up on your requirements).

If your requirement is different, then let us know
Kandy
_________________
Try and Try again…You will succeed atlast!!
Sathishkumarins
Premium Member
Premium Member
Posts: 41
Joined: Tue Jul 08, 2008 5:45 am
Location: Columbus

Post by Sathishkumarins »

Kandyshandy provided the right method.

But in order to get your o/p in a simple way and with same order you have mentioned then you can have source and link partitioner with 5 seq after and 5 hash file and followed by a transformer where you can pull the data from the hash files and finally a seq file to get the desired output.

Message me if you need that dsx :)
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

Good Sathish. I mistook that your question was for a PX job.

If it is a server job, there are many methods to achieve the same !! e.g. you group set is fixed.. for every ID, you will have only 5 records. If there is a sixth record then you may be need to introduce a new hash file which needs redesign.

In your method, you don't need 5 seq files at all. You can directly load to hash files.
Kandy
_________________
Try and Try again…You will succeed atlast!!
Sathishkumarins
Premium Member
Premium Member
Posts: 41
Joined: Tue Jul 08, 2008 5:45 am
Location: Columbus

Post by Sathishkumarins »

kandyshandy wrote: In your method, you don't need 5 seq files at all. You can directly load to hash files.
Yes you are right. I removed that and tried and it's working and faster too....

Thanks,
Kandy.

@rohit_mca2003: Hope this might be suffcient enough to get your desired output :)
Post Reply