DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
Author Message
DWS



Group memberships:
Premium Members

Joined: 13 Oct 2011
Posts: 17

Points: 208

Post Posted: Thu May 24, 2012 1:29 pm Reply with quote    Back to top    

DataStage® Release: 8x
Job Type: Parallel
OS: Unix
Hi There,
I have a request of produing outputs from inputs both listed below. The key columns for source is the first 3 columns and the target is first 2 columns.

Indeed, I don't know how Seq_Num could be, some cases it is over a thousand. I think the solution for this case is to use Loop in Transformer Stage. But I have not used for this case.
I am asking if any of you have had this experience.

Any feedback is great appreciated.

Thank you very much in advance.
--------------------------------------------------------------


Input:
----------
Emp_ID Name Seq_Num Desc(Varchar(10)
100 Tiger 1 I am
100 Tiger 2 en
100 Tiger 3 ETL
100 Tiger 4 developer
100 Tiger 5 working fr
100 Tiger 6 om
100 Tiger 7 home
100 Tiger 8 everyday
100 Tiger 9 from 10 am
100 Tiger 10 to 10 pm.

Output:
----------
Emp_ID Name Desc(LongVarchar(10000))
100 Tiger I am an ETL developer working from home everyday from 10 am to 10 pm.
-----------------------------------------------------------------------------
jwiles



Group memberships:
Premium Members

Joined: 14 Nov 2004
Posts: 1236

Points: 9942

Post Posted: Thu May 24, 2012 3:08 pm Reply with quote    Back to top    

Using the transformer's loop functionality would be one way, although you could also use simple key break detection (keyChange from Sort stage and LastRowInGroup() in Transformer) as well. Looping and key break detection are both well-documented in the Transformer stage documentation in the Parallel Job Developer's Guide .

Regards,

_________________
- james wiles


All generalizations are false, including this one - Mark Twain.
Rate this response:  
Not yet rated
DWS



Group memberships:
Premium Members

Joined: 13 Oct 2011
Posts: 17

Points: 208

Post Posted: Fri May 25, 2012 9:19 am Reply with quote    Back to top    

Thank you, James.

You are right.
Both ways are good for this case.

I have done key break detection method and also Loop for horizontal pivot but not Vertical pivot.
Do you have any sample of this vertical looping?

Thanks,
Rate this response:  
Not yet rated
jwiles



Group memberships:
Premium Members

Joined: 14 Nov 2004
Posts: 1236

Points: 9942

Post Posted: Fri May 25, 2012 2:44 pm Reply with quote    Back to top    

I don't have specific examples of vertical pivot loops. You should be able to queue a key's worth of rows, then simply build your final output string by concatenating the pieces from each row as you loop through (very similar to what you would do with keybreak detection).

The documentation for transformer loops includes an example for data aggregation, which is what this basically is.

Regards,

_________________
- james wiles


All generalizations are false, including this one - Mark Twain.
Rate this response:  
Not yet rated
mobashshar
Participant



Joined: 20 Apr 2005
Posts: 80
Location: U.S.
Points: 412

Post Posted: Sat May 26, 2012 4:38 pm Reply with quote    Back to top    

Use KeyBreak detection logic using stage variables to concatenate DESC field and use lastrowingroup function to get the last row.
Please let me know if you need more info
Rate this response:  
Not yet rated
DWS



Group memberships:
Premium Members

Joined: 13 Oct 2011
Posts: 17

Points: 208

Post Posted: Mon May 28, 2012 6:58 am Reply with quote    Back to top    

Thank you very much for your reply.
Do you have any sample of
Quote:
Use KeyBreak detection logic using stage variables to concatenate DESC field and use lastrowingroup function to get the last row.


Thanks,
Rate this response:  
Not yet rated
chetan.c
Participant



Joined: 17 Jan 2012
Posts: 112
Location: Bangalore
Points: 929

Post Posted: Tue May 29, 2012 2:27 am Reply with quote    Back to top    

Hi,
StageVaribale:
svconcat
svconcat:Inputlink.desc

Output derivation
Map all columns to output except number as per your requirement but in place of desc column use the stage variable svconcat .

Constraint:
LastRowInGroup(Inputlink.id)

Sort incoming data as it is a necessary for LastRowInGroup() else job will abort.

From Documentation
Quote:
LastRowInGroup(column) returns TRUE if the value of the specified column is about to change


Thanks,
Chetan.C
Rate this response:  
Not yet rated
DWS



Group memberships:
Premium Members

Joined: 13 Oct 2011
Posts: 17

Points: 208

Post Posted: Thu May 31, 2012 3:44 pm Reply with quote    Back to top    

Thanks, Chetan.

This issue has been solved. It has taken too long to solve this case. Because I have never realized that it needs only ONE node for the job.
I also added more stage variables.

Is it possible to to use mutiple nodes instead of one node?

Does anyone know why can not use mutiple nodes?



Thanks in advance.

Very Happy
Rate this response:  
Not yet rated
jwiles



Group memberships:
Premium Members

Joined: 14 Nov 2004
Posts: 1236

Points: 9942

Post Posted: Thu May 31, 2012 4:03 pm Reply with quote    Back to top    

You should be able to use multiple nodes provided that you properly partition your data so that all data for a key value remains in the same partition--a Core Concept of DataStage parallel processing!!!!!

By running in a single node, you have forced all of the data into a single partition, which meets the above requirement but limits the throughput of the job.

Based on your decriptions of the rules, the expected results and the example data provided, you could probably partition your data on:
a) EmpID alone or
b) EmpID and Name
and meet the requirements of the necessary job logic. This of course means not using "Auto" partitioning Smile

Regards,

_________________
- james wiles


All generalizations are false, including this one - Mark Twain.
Rate this response:  
Not yet rated
mobashshar
Participant



Joined: 20 Apr 2005
Posts: 80
Location: U.S.
Points: 412

Post Posted: Thu May 31, 2012 5:21 pm Reply with quote    Back to top    

Ok.. Here is the logic to do key break

Define four stage variables in the exact order.
svNewKey input.Emp_ID
svKeyChange svNewKey <> svLastKey or svLastKey = ""
svLastKey input.Emp_ID
svDesc if svKeyChange then input.Desc else svCounter : input.Desc

Make sure to Partition and Sort on Emp_ID and Sort Only on Seq_Num in Transformer Input

Use LastRowInGroup(input.Emp_ID) as constraint to pass out the last row per employee id.

Output columns will be mapped as follows:

Emp_ID = input.Emp_ID
Name = input.Name
Desc = svDesc

Hope this will help you as it will work on multiple nodes
Rate this response:  
Not yet rated
mobashshar
Participant



Joined: 20 Apr 2005
Posts: 80
Location: U.S.
Points: 412

Post Posted: Thu May 31, 2012 5:25 pm Reply with quote    Back to top    

typo correction:

Define four stage variables in the exact order.
svNewKey = input.Emp_ID
svKeyChange = svNewKey <> svLastKey or svLastKey = ""
svLastKey = input.Emp_ID
svDesc = if svKeyChange then input.Desc else svDesc : input.Desc
[/list]
Rate this response:  
Not yet rated
jwiles



Group memberships:
Premium Members

Joined: 14 Nov 2004
Posts: 1236

Points: 9942

Post Posted: Thu May 31, 2012 10:09 pm Reply with quote    Back to top    

In 8.5+, it's a little bit simpler:

svDesc: if svNewKey=1 then input.Desc else svDesc : input.Desc
svNewKey (InitialValue=1): if LastRowInGroup(input.EmpID) then 1 else 0

Output Link Constraint: LastRowInGroup(input.EmpID)

Be sure to sort prior to the transformer (a requirement of the transformer's key detection routines).

Regards,

_________________
- james wiles


All generalizations are false, including this one - Mark Twain.
Rate this response:  
Not yet rated
DWS



Group memberships:
Premium Members

Joined: 13 Oct 2011
Posts: 17

Points: 208

Post Posted: Fri Jun 01, 2012 5:20 am Reply with quote    Back to top    

Thank you, everyone for your input.

This case has been solved in single node only which means the input data has not properly partitioning in term of leting all grouped data in same partition.

I have to modify the job and give the proper partition set up.

Anyway, thanks for everyone's help and will let you the know final result.

Regards,
Rate this response:  
Not yet rated
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 36593
Location: Denver, CO
Points: 186532

Post Posted: Fri Jun 01, 2012 6:24 am Reply with quote    Back to top    

mobashshar wrote:
typo correction:

Idea FYI - there's no need to create a 'typo correction' post, simply go back and edit your post, correct it there.

_________________
-craig

It's a scheme of devices to get at low prices all goods from cough mixtures to cables
Which tickled the sailors by treating retailers as though they were all veg-e-tables
Rate this response:  
Not yet rated
DWS



Group memberships:
Premium Members

Joined: 13 Oct 2011
Posts: 17

Points: 208

Post Posted: Fri Jun 01, 2012 11:39 am Reply with quote    Back to top    

Finally it works perfectly now.
Thanks for everyone who had input for this topic.

If anyone in future wants to know the exactlly steps, I could post those in the forum.

Thanks,
Rate this response:  
Not yet rated
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours