Increment sequence

dsdevper
Premium Member
Posts: 86
Joined: Tue Aug 19, 2008 9:31 am

Post by dsdevper »

Hi,

I have to increment an integer column value by one whenever there is a duplicate row.

Example input (key columns: COLA, COLB, COLC, COLD):

COLA COLB COLC COLD
-----------------------------------
111 EEE 8XX SSSS
222 AAA 9XX SGDH
222 AAA 9XX HUDG
222 AAA 9XX INKH
333 KKK 7XX HJHJ
333 KKK 7XX JHLJ
333 III 6XX JKKK

Output (key columns: COLA, COLB, COLC, INT):

COLA COLB COLC INT
-----------------------------------
111 EEE 8XX 1
222 AAA 9XX 1
222 AAA 9XX 2
222 AAA 9XX 3
333 KKK 7XX 1
333 KKK 7XX 2
333 III 6XX 1


Please give me some suggestions.
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

A simple solution is to use stage variables. Try something like this:
Sort the data on your three columns (COLA, COLB, COLC).
Initialize sVValue to 1.

sVCurrent = COLA:COLB:COLC
sVValue = If sVCurrent = sVPrevious Then sVValue + 1 Else 1
sVPrevious = sVCurrent
Then use sVValue in the derivation of the INT column.
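
For anyone who wants to check the counter logic outside DataStage, here is a minimal Python sketch of the same idea: compare each row's key with the previous row's key, add one on a match, and restart at 1 when the key changes. The function name number_duplicates and the tuple layout are made up for illustration. It assumes the rows arrive already grouped on COLA, COLB, COLC, which in a parallel job would presumably also mean partitioning on those columns so duplicates land in the same partition.

```python
def number_duplicates(rows):
    """Number rows within each run of identical (COLA, COLB, COLC) keys.

    Assumes rows are already sorted or grouped on the three key columns.
    """
    previous = None   # plays the role of sVPrevious
    value = 0         # plays the role of sVValue
    out = []
    for cola, colb, colc in rows:
        current = (cola, colb, colc)          # sVCurrent
        # Same key as the previous row: increment; otherwise restart at 1.
        value = value + 1 if current == previous else 1
        previous = current
        out.append((cola, colb, colc, value))
    return out


# Sample rows taken from the question (COLD dropped, as in the expected output).
rows = [
    ("111", "EEE", "8XX"),
    ("222", "AAA", "9XX"),
    ("222", "AAA", "9XX"),
    ("222", "AAA", "9XX"),
    ("333", "KKK", "7XX"),
    ("333", "KKK", "7XX"),
    ("333", "III", "6XX"),
]
result = number_duplicates(rows)
for row in result:
    print(*row)
```

Comparing the key as a tuple rather than a concatenated string avoids accidental matches such as "1" : "23" versus "12" : "3"; in the Transformer, a delimiter between the concatenated columns would serve the same purpose.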
I haven't failed, I've found 10,000 ways that don't work.
Thomas Alva Edison(1847-1931)
srinivas.g
Participant
Posts: 251
Joined: Mon Jun 09, 2008 5:52 am

Post by srinivas.g »

The design is:

Sequential File --> Sort --> Transformer --> Dataset

In the Sort stage, set

Create Cluster Key Change Column = True

and in the Transformer use a stage variable:
Dup = If clusterKeyChange=1 Then clusterKeyChange Else Dup+1

The Dup value is your output.
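
The two-step design above can be sketched in Python as a rough illustration, not as DataStage code. The first generator stands in for the Sort stage's key change flag (1 on the first row of each key group, 0 otherwise) and the second for the Dup stage variable. The names add_key_change and add_dup are hypothetical, and the rows are assumed to be already sorted on COLA, COLB, COLC.

```python
def add_key_change(rows):
    """Append a flag: 1 on the first row of each key group, else 0."""
    previous = object()  # sentinel that never equals a real key
    for row in rows:
        key = row[:3]    # (COLA, COLB, COLC)
        yield row + (1 if key != previous else 0,)
        previous = key


def add_dup(flagged_rows):
    """Replace the flag with a running count that restarts at each key change."""
    dup = 0
    for *cols, key_change in flagged_rows:
        # Mirrors: Dup = If clusterKeyChange=1 Then clusterKeyChange Else Dup+1
        dup = key_change if key_change == 1 else dup + 1
        yield (*cols, dup)


# Sample rows from the question, already grouped on the three key columns.
sample = [
    ("111", "EEE", "8XX"),
    ("222", "AAA", "9XX"),
    ("222", "AAA", "9XX"),
    ("222", "AAA", "9XX"),
    ("333", "KKK", "7XX"),
    ("333", "KKK", "7XX"),
    ("333", "III", "6XX"),
]
flagged = list(add_key_change(sample))
numbered = list(add_dup(flagged))
for row in numbered:
    print(*row)
```

Splitting the flag from the counter keeps the Transformer derivation to a single comparison, since the key comparison itself is done upstream.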
Srinu Gadipudi