Combine multiple rows into one row (Vertical Pivoting)

rohit_mca2003 · Post by **rohit_mca2003** » Thu Dec 11, 2008 11:29 am

Hi,
I have a requirement to pivot the multiple rows into a single row. I have input data as below:

Input Data:
-------------
Id Fname Sname Section Ques Ans
--------------------------------------------------------------------
100 Sam Roy S1 Q11 A11
100 Sam Roy S1 Q12 A12
100 Sam Roy S1 Q13 A13
100 Sam Roy S2 Q21 A21
100 Sam Roy S2 Q22 A22

I like output data as follows in one row and different columns:
-----------------------------------------------------------------------
100 Sam Roy S1 Q11 A11 Q12 A12 Q13 A13 S2 Q21 A21 Q22 A22

Could you please help me to resolve this.

Regards.

tehavele · Post by **tehavele** » Thu Dec 11, 2008 11:44 am

try using column export stage.

chulett · Post by **chulett** » Thu Dec 11, 2008 11:46 am

Search the forums for "vertical pivot", it has already been discussed here. Alot.

rohit_mca2003 · Post by **rohit_mca2003** » Fri Dec 12, 2008 5:39 am

Thanks for all your replies.
I have searched the topics but I have not got any suitable reply for my input set of data.

If you see the input set it has two pivot columns ID and SECTION. It would be greatful if anyone can provide solution to this example.

Input Data:
-------------
Id Fname Sname Section Ques Ans
--------------------------------------------------------------------
100 Sam Roy S1 Q11 A11
100 Sam Roy S1 Q12 A12
100 Sam Roy S1 Q13 A13
100 Sam Roy S2 Q21 A21
100 Sam Roy S2 Q22 A22

I like output data as follows in one row and different columns:
-----------------------------------------------------------------------
100 Sam Roy S1 Q11 A11 Q12 A12 Q13 A13 S2 Q21 A21 Q22 A22

Raghavendra · Post by **Raghavendra** » Fri Dec 12, 2008 6:53 am

In this example I see you have a set of 5 records. In your data, is this set of records fixed.

Sathishkumarins · Post by **Sathishkumarins** » Fri Dec 12, 2008 8:02 am

"Input Data:
-------------
Id Fname Sname Section Ques Ans
--------------------------------------------------------------------
100 Sam Roy S1 Q11 A11
100 Sam Roy S1 Q12 A12
100 Sam Roy S1 Q13 A13
100 Sam Roy S2 Q21 A21
100 Sam Roy S2 Q22 A22

I like output data as follows in one row and different columns:
-----------------------------------------------------------------------
100 Sam Roy S1 Q11 A11 Q12 A12 Q13 A13 S2 Q21 A21 Q22 A22"

Hey I'm working on this as an assignment.
Will get back to you if i get a solution.

boxtoby · Post by **boxtoby** » Fri Dec 12, 2008 8:04 am

If I understand tour requirement correctly then I have done this several times before!

I also imagine your data looks like this:

100......Q11 A11
100...............
100..............
100...............
100..............
101.......Q11 A11
101..............
101.............
101.............
101...............

Do you always have 5 records per "group"? (100, 101 etc)

Let's assume so for the moment.

What you need is a transformer with stage variables.

use an sv to count the rows as they come in. When the count = 5 then output a row.

While rows 1 - 5 are coming in use other stage vars to populate the output columns, the output columns are related to the row count.

It's a bit fidley and you'll need to check carefully what happens and when, but it works.

If the row count is not always 5 then you need to geberate the row count before the transformer:

Split the rows in to an "ID column only" and "all columns" feed using a copy stage. On the ID column only use an Aggregator to count rows per ID. (On the input to the copy stage repartition using the hash option on the ID and sort by ID to ensure the aggregation and subsequent look up works.)

Use a look up stage to join the two feeds and collect the row count, now procede as before.

Hope this helps!

kandyshandy · Post by **kandyshandy** » Fri Dec 12, 2008 8:37 am

Whether the set of records is fixed or not, the simplest way is using stage variables. One way of doing this is "if previous KEY does not match with current KEY, then assign input columns to stage var Else concatenate input columns to stagevar"

The you can use RemDuplicates or aggregator to get the last record. Once you have the last record, you can use INDEX and FIELD functions to populate your target fields (based up on your requirements).

If your requirement is different, then let us know

Sathishkumarins · Post by **Sathishkumarins** » Fri Dec 12, 2008 10:40 am

Kandyshandy provided the right method.

But in order to get your o/p in a simple way and with same order you have mentioned then you can have source and link partitioner with 5 seq after and 5 hash file and followed by a transformer where you can pull the data from the hash files and finally a seq file to get the desired output.

Message me if you need that dsx

kandyshandy · Post by **kandyshandy** » Fri Dec 12, 2008 11:04 am

Good Sathish. I mistook that your question was for a PX job.

If it is a server job, there are many methods to achieve the same !! e.g. you group set is fixed.. for every ID, you will have only 5 records. If there is a sixth record then you may be need to introduce a new hash file which needs redesign.

In your method, you don't need 5 seq files at all. You can directly load to hash files.

Sathishkumarins · Post by **Sathishkumarins** » Fri Dec 12, 2008 11:16 am

kandyshandy wrote: In your method, you don't need 5 seq files at all. You can directly load to hash files.

Yes you are right. I removed that and tried and it's working and faster too....

Thanks,
Kandy.

@rohit_mca2003: Hope this might be suffcient enough to get your desired output

DSXchange