PX:Merge records

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Shadab_Farooque
Participant
Posts: 21
Joined: Tue Apr 24, 2007 12:39 am


Post by Shadab_Farooque »

I have an input file in CSV format, as shown below:
H,1,2....
D,1....
D,2....
H,2,1...
D,1...

Record 1 says it is the 1st header record, with 2 detail records to follow.
The D,1 and D,2 lines are its corresponding 2 detail records.
The 4th line is again a header record, this time with 1 detail record.

I want the output as
H,1,2...D,1..
H,1,2...D,2..
H,2,1...D,1..

Please advise how to achieve this.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Sort data by record type descending, then by field2 ascending. In a Transformer stage, use stage variables to store the header record and to assemble the desired output records. Constrain the output so that only rows with record type = "D" are passed.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Shadab_Farooque
Participant
Posts: 21
Joined: Tue Apr 24, 2007 12:39 am

Post by Shadab_Farooque »

Hi Ray,
Could you please elaborate on how to use stage variables to achieve the above?

Thanks and Regards
Shadab Farooque
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Stage variables store information from the header row. Each is updated only when the current row is a header row. For example, svHCol2 might be derived as

Code:

If InLink.Col1 = "H" Then InLink.Col2 Else svHCol2

When processing a detail row, use the stage variables to supply the header row values. The output link is constrained as InLink.Col1 = "D".
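
The carry-forward logic described above can be illustrated outside DataStage. The following is only a minimal Python sketch, not a Transformer stage: the function name merge_header_detail and the variable header are hypothetical, header stands in for the stage variables, and the sketch assumes the rows are read one at a time in their original file order.

```python
import csv


def merge_header_detail(lines):
    """Prefix each detail ("D") row with the most recent header ("H") row."""
    header = None  # plays the role of the stage variables (hypothetical name)
    merged = []
    for fields in csv.reader(lines):
        if not fields:
            continue  # skip blank lines
        if fields[0] == "H":
            # Header row: remember it, but do not output it
            header = fields
        elif fields[0] == "D":
            # Detail row: output it together with the stored header values
            if header is None:
                raise ValueError("detail row found before any header row")
            merged.append(header + fields)
    return merged


sample = ["H,1,2", "D,1", "D,2", "H,2,1", "D,1"]
for row in merge_header_detail(sample):
    print(",".join(row))
```

The if/elif on the first field mirrors the stage variable derivation and the output constraint: header rows only update the stored values, and only detail rows produce output.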
vijayrc
Participant
Posts: 197
Joined: Sun Apr 02, 2006 10:31 am
Location: NJ

Post by vijayrc »

ray.wurlod wrote: Stage variables store information from the header row. Each is updated only when the current row is a header row. For example, svHCol2 might be derived as

Code:

If InLink.Col1 = "H" Then InLink.Col2 Else svHCol2 ...

Not to hijack this thread, but while going through it, I have a question. When I am creating a .CSV file [comma-separated, with no white space], is there a way to keep a running count of each record's length, or is there a function available in the Transformer stage or elsewhere that could do this? Any direction on this is highly appreciated. Thanks.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Do you mean adding the record length of each row to the end of the row, or keeping a total of the sum of all lengths in a stage variable?
vijayrc
Participant
Posts: 197
Joined: Sun Apr 02, 2006 10:31 am
Location: NJ

Post by vijayrc »

ArndW wrote: You mean add the record length of each row to the end of the row, or keep a total of the sum of all lengths in a stage variable? ...
The requirement is to keep a running total of the record lengths of all the rows in a group of records that belong to an account, so as to create an index file for the .csv data file.

e.g.
Acct1,11,11,1,1,111,111,111,11,1,1,11
Acct1,1,1,1,1,1,1,1,1,1,1,111,11,1,11
Acct1,111111111111
Acct2,1111,11111,11111,11111,11111
Acct2,11111111,11,111,111,111,1111,1111,111
Acct2,1,11,111,1,11

If I create the above .csv data file, I would need to create an index file corresponding to it:
Acct1
Start Byte: 0000000
End Byte: 00000250 [Sum of record lengths of the first 3 rows]
Acct2
Start Byte: 000000250
End Byte: 000000475 [Start byte plus sum of record lengths of the next 3 rows]
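
The running-total idea behind such an index file can be sketched in Python. This is an illustration only, under several assumptions that are not stated in the post: rows for one account are contiguous, each row ends with a single one-byte newline, lengths are counted in UTF-8 bytes, and the end byte of one account equals the start byte of the next (as in the example above). The function name build_index and the sample rows are hypothetical.

```python
def build_index(rows, newline_len=1):
    """Return (account, start_byte, end_byte) for each run of rows sharing an account."""
    index = []
    offset = 0  # running total of bytes written so far
    for row in rows:
        account = row.split(",", 1)[0]
        # Record length in bytes, including the assumed line terminator
        length = len(row.encode("utf-8")) + newline_len
        if index and index[-1][0] == account:
            # Same account as the previous row: extend its end byte
            index[-1][2] = offset + length
        else:
            # New account: it starts where the previous rows ended
            index.append([account, offset, offset + length])
        offset += length
    return [tuple(entry) for entry in index]


rows = ["A,1", "A,22", "B,3"]  # hypothetical sample data
for account, start, end in build_index(rows):
    print(account, f"Start Byte: {start:09d}", f"End Byte: {end:09d}")
```

In a Transformer stage the same pattern would presumably use one stage variable for the running offset and another for the previous account key, but the exact derivations depend on how the rows are built.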
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You have hijacked this thread. Please start a new one. Your question is unrelated to the original post.

The method would depend on precisely how you create the rows for the CSV file but it's do-able in any particular case.