DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
This topic has been marked "Resolved."
Author Message
kashif007
Participant



Joined: 07 Jun 2006
Posts: 207
Location: teaneck
Points: 1840

Post Posted: Wed May 07, 2008 2:51 pm Reply with quote    Back to top    

DataStage® Release: 7x
Job Type: Parallel
OS: Unix
Hi

I want to achieve the following logic in my job:

Input Data:

Col1 Col2
A Value1
A Value2
A Value3
B Value4
B Value5
C Value6
C Value7
C Value8
C Value9

Output Data:

Col1 col2 col3 col4 col5
A Value1 Value2 Value3
B Value4 Value5
C Value6 Value7 Value8 Value9

I tried to do vertical pivot using Transfomer but was not successfull. Can anyone tell me how can I achieve this logic.
Thanks

_________________
Regards
Kashif Khan
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 40572
Location: Melbourne, Australia
Points: 226407

Post Posted: Wed May 07, 2008 4:08 pm Reply with quote    Back to top    

Partition and sort by the key column, and generate a key change column in the Sort stage. Downstream of that, accumulate your list of non-key columns in stage variables, initializing to null on key c ...

_________________
Zenith Solutions (Australia) Pty Ltd
-- from obscurity to clarity
Rate this response:  
wahi80
Participant



Joined: 07 Feb 2008
Posts: 141

Points: 1260

Post Posted: Wed May 07, 2008 4:37 pm Reply with quote    Back to top    

Another workaround could be to write a query to convert the columns to rows
Rate this response:  
Not yet rated
kashif007
Participant



Joined: 07 Jun 2006
Posts: 207
Location: teaneck
Points: 1840

Post Posted: Wed May 07, 2008 5:53 pm Reply with quote    Back to top    

I did as you said Ray. But your process only removes the duplicates. But I still can't figure out how do I convert the vertical rows into horizontal columns.

_________________
Regards
Kashif Khan
Rate this response:  
Not yet rated
jhmckeever



Group memberships:
Premium Members

Joined: 14 Jul 2005
Posts: 233
Location: Melbourne, Australia
Points: 2424

Post Posted: Wed May 07, 2008 6:38 pm Reply with quote    Back to top    

Quote:
... accumulate your list of non-key columns in stage variables, initializing to null on key change.


Ray described how to use a transformer to perform the pivot you require. Did you follow the instructions, particularly the part quoted above, closely?

_________________
John McKeever
Zenith Solutions, Certified DS 7.5, 8.0
Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too.
Rate this response:  
Not yet rated
jhmckeever



Group memberships:
Premium Members

Joined: 14 Jul 2005
Posts: 233
Location: Melbourne, Australia
Points: 2424

Post Posted: Wed May 07, 2008 6:55 pm Reply with quote    Back to top    

Here's a summary of how your stage variables would be accumulated over the course of your example input data: ('-' means NULL)

Code:
INPUT COLUMNS           STAGE VARIABLES
----------------------  --------------------------------------
Key  Value   keyChange  svCol1  svCol2  svCol3  svCol4  svCol5
A    Value1  1          Value1  -       -       -       -
A    Value2  0          Value1  Value2  -       -       -
A    Value3  0          Value1  Value2  Value3  -       -
B    Value4  1          Value4  -       -       -       -
B    Value5  0          Value4  Value5  -       -       -
C    Value6  1          Value6  -       -       -       -
C    Value7  0          Value6  Value7  -       -       -
C    Value8  0          Value6  Value7  Value8  -       -
C    Value9  0          Value6  Value7  Value8  Value9  -


You'll see how using the 'keep last' approach in the Remove Duplicates stage deliveres the output you require.

J.

_________________
John McKeever
Zenith Solutions, Certified DS 7.5, 8.0
Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too.
Rate this response:  
Not yet rated
kashif007
Participant



Joined: 07 Jun 2006
Posts: 207
Location: teaneck
Points: 1840

Post Posted: Wed May 07, 2008 9:25 pm Reply with quote    Back to top    

The maximum possible occurance of Non-key Col2 would be 15. So I created 15 stage variables and in each stage variable I have the following logic:

If keyChange = '1' Then SetNull() Else Col2

Then I map them to all 15 output columns in the transformer. After the remove duplicate logic, I get the same data in all the 15 columns. Following is an example:

col1(key) col2(sv1) col3(sv2) col4(sv3) col5(sv4) col6(sv5) col7(sv6).................till 15 occurance
A Value3 Value3 Value3 Value3 Value3 Value3
B Value5 Value5 Value5 Value5 Value5 Value5
C Value9 Value9 Value9 Value9 Value9 Value9

One value pertains for all the occurances. What you have drawn in the picture is absolutely correct but I am not able to apply the logic as you have described. I am still missing something, please advice.

Thanks

_________________
Regards
Kashif Khan
Rate this response:  
Not yet rated
jhmckeever



Group memberships:
Premium Members

Joined: 14 Jul 2005
Posts: 233
Location: Melbourne, Australia
Points: 2424

Post Posted: Wed May 07, 2008 10:25 pm Reply with quote    Back to top    

Try the variables below. I've put some absurd indenting in to try and make the logic clearer. I'm not guaranteeing this to be the optimum solution, just something that should work Smile

Code:
svColId = If inLink.KeyChange Then 1 Else svColId + 1

svCol1 = If keyChange Then inLink.Value Else svCol1

svCol2 = If   inLink.keyChange
         Then SetNull()
         Else If   svColId=2
              Then inLink.Value
              Else svCol2

svCol3 = If   inLink.keyChange
         Then SetNull()
         Else If   svColId=3
              Then inLink.Value
              Else svCol3

etc.


This (hopefully!) delivers:

Code:
INPUT COLUMNS           STAGE VARIABLES
----------------------  ----------------------------------------------
Key  Value   keyChange  svColId svCol1  svCol2  svCol3  svCol4  svCol5
A    Value1  1          1       Value1  -       -       -       -
A    Value2  0          2       Value1  Value2  -       -       -
A    Value3  0          3       Value1  Value2  Value3  -       -
B    Value4  1          1       Value4  -       -       -       -
B    Value5  0          2       Value4  Value5  -       -       -
C    Value6  1          1       Value6  -       -       -       -
C    Value7  0          2       Value6  Value7  -       -       -
C    Value8  0          3       Value6  Value7  Value8  -       -
C    Value9  0          4       Value6  Value7  Value8  Value9  -


This is off the top of my head so no warranty implied ... Smile

J.

_________________
John McKeever
Zenith Solutions, Certified DS 7.5, 8.0
Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too.
Rate this response:  
kashif007
Participant



Joined: 07 Jun 2006
Posts: 207
Location: teaneck
Points: 1840

Post Posted: Thu May 08, 2008 12:13 pm Reply with quote    Back to top    

Brilliant Brilliant Brilliant. Works beautifully. Thanks

_________________
Regards
Kashif Khan
Rate this response:  
Not yet rated
ShaneMuir



Group memberships:
Premium Members

Joined: 15 Jun 2004
Posts: 271
Location: London
Points: 2067

Post Posted: Thu May 08, 2008 11:56 pm Reply with quote    Back to top    

You could also do it by replace the transform suggested above with a transform containing only 3 or 4 stage variables and a downstream column import stage.

You would still need to know the max number of output columns though.

Set up stage variables in your transform to:
1. Count the number of incoming columns per key value (using the key change. Eg svCount = If Link.keyChange then 1 else svCount+1 (initialise the stage variable to 1)
2. Build the number of columns. eg svBuildColumns = If Link.keyChange then Link.PivotValue else svBuildColumns :'|': Link.PivotValue - where '|' is your delimiter of choice
3. An output stage variable. Because the column import stage expects the number of delimiters to be consistent, you would need to build a consistent string of columns by padding the output with your delimiter value. eg svOutput = svBuildColumns : str('|', MaxColumns - svCount)

Downstream from the transform set up a column import stage which passes through the key values, and splits your pivoted output column from the transform into separate columns based on the delimiter you chose.

Not sure that this design has any better performance, but I think setting it up is easier, especially when you have a larger number of columns to pivot.
Rate this response:  
Not yet rated
SettValleyConsulting



Group memberships:
Premium Members

Joined: 04 Sep 2003
Posts: 52
Location: UK & Europe
Points: 711

Post Posted: Mon May 12, 2008 9:57 am Reply with quote    Back to top    

Another way to skin the Vertical Pivot cat...

Sort and add a KeyChange column as in previous suggestions.

Add a transformer that increments a stage variable and resets on key change giving a Counter field in the output:

If KeyChange=1 then svCOunt = 1 Else svCount = svCount + 1

Add a switch stage with as many output links as the maximum instances of Value that you need. This will Switch on the counter column and send first row down link 1, second down link 2 etc. Also renames the Value field to Value1 in Link1, Value2 in Link2 etc.

All outputs from the Switch feed into a Merge that merges on your key and moves

Link1.Value1 -> Value1,
Link2.Value2 -> Value2

etc.


Not claiming any particular performance benefits for this technique, but it does not generate any supernumary rows that need to be deduped out, and does more of the work using native EE Stages rather than a parallel Transformer so might be worth a try.


Phil Clarke.
Rate this response:  
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours