| Author |
Message |
kashif007
Participant
Joined: 07 Jun 2006
Posts: 207
Location: teaneck
Points: 1840
|
|
| DataStage® Release: 7x |
| Job Type: Parallel |
| OS: Unix |
|
Hi
I want to achieve the following logic in my job:
Input Data:
Col1 Col2
A Value1
A Value2
A Value3
B Value4
B Value5
C Value6
C Value7
C Value8
C Value9
Output Data:
Col1 col2 col3 col4 col5
A Value1 Value2 Value3
B Value4 Value5
C Value6 Value7 Value8 Value9
I tried to do vertical pivot using Transfomer but was not successfull. Can anyone tell me how can I achieve this logic.
Thanks
|
_________________ Regards
Kashif Khan |
|
|
|
 |
ray.wurlod
Participant
Group memberships: Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group
Joined: 23 Oct 2002
Posts: 40572
Location: Melbourne, Australia
Points: 226407
|
|
|
|
|
|
|
Partition and sort by the key column, and generate a key change column in the Sort stage. Downstream of that, accumulate your list of non-key columns in stage variables, initializing to null on key c ...
|
_________________ Zenith Solutions (Australia) Pty Ltd
-- from obscurity to
clarity
|
|
|
|
 |
wahi80
Participant
Joined: 07 Feb 2008
Posts: 141
Points: 1260
|
|
|
|
|
|
|
Another workaround could be to write a query to convert the columns to rows
|
|
|
|
|
 |
kashif007
Participant
Joined: 07 Jun 2006
Posts: 207
Location: teaneck
Points: 1840
|
|
|
|
|
|
|
I did as you said Ray. But your process only removes the duplicates. But I still can't figure out how do I convert the vertical rows into horizontal columns.
|
_________________ Regards
Kashif Khan |
|
|
|
 |
jhmckeever

Group memberships: Premium Members
Joined: 14 Jul 2005
Posts: 233
Location: Melbourne, Australia
Points: 2424
|
|
|
|
|
|
| Quote: |
| ... accumulate your list of non-key columns in stage variables, initializing to null on key change. |
Ray described how to use a transformer to perform the pivot you require. Did you follow the instructions, particularly the part quoted above, closely?
|
_________________ John McKeever
Zenith Solutions, Certified DS 7.5, 8.0
Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too. |
|
|
|
 |
jhmckeever

Group memberships: Premium Members
Joined: 14 Jul 2005
Posts: 233
Location: Melbourne, Australia
Points: 2424
|
|
|
|
|
|
Here's a summary of how your stage variables would be accumulated over the course of your example input data: ('-' means NULL)
| Code: |
INPUT COLUMNS STAGE VARIABLES
---------------------- --------------------------------------
Key Value keyChange svCol1 svCol2 svCol3 svCol4 svCol5
A Value1 1 Value1 - - - -
A Value2 0 Value1 Value2 - - -
A Value3 0 Value1 Value2 Value3 - -
B Value4 1 Value4 - - - -
B Value5 0 Value4 Value5 - - -
C Value6 1 Value6 - - - -
C Value7 0 Value6 Value7 - - -
C Value8 0 Value6 Value7 Value8 - -
C Value9 0 Value6 Value7 Value8 Value9 - |
You'll see how using the 'keep last' approach in the Remove Duplicates stage deliveres the output you require.
J.
|
_________________ John McKeever
Zenith Solutions, Certified DS 7.5, 8.0
Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too. |
|
|
|
 |
kashif007
Participant
Joined: 07 Jun 2006
Posts: 207
Location: teaneck
Points: 1840
|
|
|
|
|
|
The maximum possible occurance of Non-key Col2 would be 15. So I created 15 stage variables and in each stage variable I have the following logic:
If keyChange = '1' Then SetNull() Else Col2
Then I map them to all 15 output columns in the transformer. After the remove duplicate logic, I get the same data in all the 15 columns. Following is an example:
col1(key) col2(sv1) col3(sv2) col4(sv3) col5(sv4) col6(sv5) col7(sv6).................till 15 occurance
A Value3 Value3 Value3 Value3 Value3 Value3
B Value5 Value5 Value5 Value5 Value5 Value5
C Value9 Value9 Value9 Value9 Value9 Value9
One value pertains for all the occurances. What you have drawn in the picture is absolutely correct but I am not able to apply the logic as you have described. I am still missing something, please advice.
Thanks
|
_________________ Regards
Kashif Khan |
|
|
|
 |
jhmckeever

Group memberships: Premium Members
Joined: 14 Jul 2005
Posts: 233
Location: Melbourne, Australia
Points: 2424
|
|
|
|
|
|
Try the variables below. I've put some absurd indenting in to try and make the logic clearer. I'm not guaranteeing this to be the optimum solution, just something that
should
work
| Code: |
svColId = If inLink.KeyChange Then 1 Else svColId + 1
svCol1 = If keyChange Then inLink.Value Else svCol1
svCol2 = If inLink.keyChange
Then SetNull()
Else If svColId=2
Then inLink.Value
Else svCol2
svCol3 = If inLink.keyChange
Then SetNull()
Else If svColId=3
Then inLink.Value
Else svCol3
etc.
|
This (hopefully!) delivers:
| Code: |
INPUT COLUMNS STAGE VARIABLES
---------------------- ----------------------------------------------
Key Value keyChange svColId svCol1 svCol2 svCol3 svCol4 svCol5
A Value1 1 1 Value1 - - - -
A Value2 0 2 Value1 Value2 - - -
A Value3 0 3 Value1 Value2 Value3 - -
B Value4 1 1 Value4 - - - -
B Value5 0 2 Value4 Value5 - - -
C Value6 1 1 Value6 - - - -
C Value7 0 2 Value6 Value7 - - -
C Value8 0 3 Value6 Value7 Value8 - -
C Value9 0 4 Value6 Value7 Value8 Value9 -
|
This is off the top of my head so no warranty implied ...
J.
|
_________________ John McKeever
Zenith Solutions, Certified DS 7.5, 8.0
Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too. |
|
|
|
 |
kashif007
Participant
Joined: 07 Jun 2006
Posts: 207
Location: teaneck
Points: 1840
|
|
|
|
|
|
|
Brilliant Brilliant Brilliant. Works beautifully. Thanks
|
_________________ Regards
Kashif Khan |
|
|
|
 |
ShaneMuir

Group memberships: Premium Members
Joined: 15 Jun 2004
Posts: 271
Location: London
Points: 2067
|
|
|
|
|
|
You could also do it by replace the transform suggested above with a transform containing only 3 or 4 stage variables and a downstream column import stage.
You would still need to know the max number of output columns though.
Set up stage variables in your transform to:
1. Count the number of incoming columns per key value (using the key change. Eg svCount = If Link.keyChange then 1 else svCount+1 (initialise the stage variable to 1)
2. Build the number of columns. eg svBuildColumns = If Link.keyChange then Link.PivotValue else svBuildColumns :'|': Link.PivotValue - where '|' is your delimiter of choice
3. An output stage variable. Because the column import stage expects the number of delimiters to be consistent, you would need to build a consistent string of columns by padding the output with your delimiter value. eg svOutput = svBuildColumns : str('|', MaxColumns - svCount)
Downstream from the transform set up a column import stage which passes through the key values, and splits your pivoted output column from the transform into separate columns based on the delimiter you chose.
Not sure that this design has any better performance, but I think setting it up is easier, especially when you have a larger number of columns to pivot.
|
|
|
|
|
 |
SettValleyConsulting

Group memberships: Premium Members
Joined: 04 Sep 2003
Posts: 52
Location: UK & Europe
Points: 711
|
|
|
|
|
|
Another way to skin the Vertical Pivot cat...
Sort and add a KeyChange column as in previous suggestions.
Add a transformer that increments a stage variable and resets on key change giving a Counter field in the output:
If KeyChange=1 then svCOunt = 1 Else svCount = svCount + 1
Add a switch stage with as many output links as the maximum instances of Value that you need. This will Switch on the counter column and send first row down link 1, second down link 2 etc. Also renames the Value field to Value1 in Link1, Value2 in Link2 etc.
All outputs from the Switch feed into a Merge that merges on your key and moves
Link1.Value1 -> Value1,
Link2.Value2 -> Value2
etc.
Not claiming any particular performance benefits for this technique, but it does not generate any supernumary rows that need to be deduped out, and does more of the work using native EE Stages rather than a parallel Transformer so might be worth a try.
Phil Clarke.
|
|
|
|
|
 |
|
|