Restructuring of data

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

harikhk
Participant
Posts: 64
Joined: Tue Jun 04, 2013 11:36 am

Restructuring of data

Post by harikhk »

I need to restructure historical data from an activities table, where each activity is stored as one record, so that each output row holds an activity together with the activity that follows it.

I have data similar to below in the history table

id,app_id,task
1,1234,entry
2,3245,entry
3,1234,accepted
4,1234,closed

The expected output is

app_id,task,next_task

1234,entry,accepted
1234,accepted,closed
1234,closed,null
3245,entry,null

If there is a consecutive task, show its value in the next_task column; if there is none, show null.

I have no idea how to achieve this.

Please share your thoughts
Thanks,
HK
*Go GREEN..Save Earth*
rkashyap
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

In a Transformer stage, stage variables are evaluated top to bottom, so a stage variable can still hold the value from the previous row while the current row is being processed. This lets you use data from two rows at the same time.
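The "previous row" idea can be illustrated outside DataStage. The following is only a rough Python sketch of the pattern, not Transformer syntax; the helper name with_previous and the sample rows are made up for illustration. The variable previous plays the role of a stage variable that is updated only after the current row has been handled.

```python
def with_previous(rows):
    """Yield (previous_row, current_row) pairs; previous_row is None for the first row."""
    previous = None  # stands in for a stage variable holding the last row seen
    for current in rows:
        # At this point both the cached previous row and the current row are available.
        yield previous, current
        previous = current  # updated only after the current row has been processed


# Sample rows for a single app_id, already in id order (illustrative data).
rows = [
    {"id": 1, "app_id": 1234, "task": "entry"},
    {"id": 3, "app_id": 1234, "task": "accepted"},
    {"id": 4, "app_id": 1234, "task": "closed"},
]

# Pair each task with the task from the row before it.
pairs = [
    (prev["task"] if prev else None, cur["task"])
    for prev, cur in with_previous(rows)
]
```

Note that this gives each row access to the row before it, not the row after it, which is why the sort order matters for the next_task requirement.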

There are numerous discussion threads on this topic; one example is viewtopic.php?p=448233
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Begin by sorting the data by app_id then by id.
Then use stage variables to detect change of app_id (new group) and to process adjacent pairs of rows.

Note that there is no "look ahead" in DataStage - but you can always sort in reverse order!
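As a rough illustration of the reverse-sort approach, here is a Python sketch of the logic (not DataStage code; the function name add_next_task and the variable names are hypothetical). It assumes the rows are sorted by app_id ascending and id descending, so the "previous" row within a group is actually the next activity in time. The remembered task is reset whenever app_id changes, and the result is re-sorted into ascending order at the end.

```python
def add_next_task(history):
    """Return (app_id, task, next_task) tuples using a reverse sort plus a cached previous row."""
    # Sort by app_id ascending, then id descending, so later activities come first in each group.
    ordered = sorted(history, key=lambda r: (r["app_id"], -r["id"]))

    out = []
    last_app_id = None  # stands in for a stage variable used to detect a new group
    last_task = None    # stands in for a stage variable holding the previous row's task
    for row in ordered:
        if row["app_id"] != last_app_id:
            last_task = None  # new group: the latest activity has no next task
        out.append({"id": row["id"], "app_id": row["app_id"],
                    "task": row["task"], "next_task": last_task})
        last_app_id = row["app_id"]
        last_task = row["task"]

    # Restore chronological order within each app_id for the final output.
    out.sort(key=lambda r: (r["app_id"], r["id"]))
    return [(r["app_id"], r["task"], r["next_task"]) for r in out]


# The sample data from the original question.
history = [
    {"id": 1, "app_id": 1234, "task": "entry"},
    {"id": 2, "app_id": 3245, "task": "entry"},
    {"id": 3, "app_id": 1234, "task": "accepted"},
    {"id": 4, "app_id": 1234, "task": "closed"},
]
result = add_next_task(history)
```

In a parallel job the data would presumably also need to be partitioned on app_id so that all rows for one application are processed together; that detail is not covered by this sketch.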
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.