Transformation Logic

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
devsonali
Premium Member
Premium Member
Posts: 155
Joined: Sun Mar 18, 2007 2:31 pm

Transformation Logic

Post by devsonali »

Hello all,

I have a job that does some calculations based on row numbers , lets say Column A has these values , I want another column B which counts based on change in value of A

So Input say

Column A , Column B


1 , 1
1 , 2
2 , 1
2 , 2
2 , 3
55 , 1
55 , 2
95 , 1
98 , 1

Basically Column B is a counter counting Occurrences of Column A , i did some searches on Key words "loops" "counter" but unable to find similar logic in here.
Thank you for looking .
rkashyap
Premium Member
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If the data are accurate and accurately sorted you could use a Remove Duplicates stage and keep just the Last value from Column B for each group of Column A values.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
devsonali
Premium Member
Premium Member
Posts: 155
Joined: Sun Mar 18, 2007 2:31 pm

Post by devsonali »

Thank you for the pointers
However , the scenario (I think) is slightly different .

Col 1 values above are calculated based on row numbers (a different requirement) , for example lets say I am calculating the value of Col 1 as floor(row number/5) in a stage variable .
As a result of that first 5 records will have a certain value , next 5 another value and so on and so forth depending on how many input records are being processed ( I am ensuring that i running transformer in sequential mode )

For every incoming record , I would need Col A (calculated value) , Column B (counter based on Col A)

Lets say I calculate Col A in in stage variable 'svColA' -- > floor(@inrownum/5)
For col B
I declare stage variables svCurrentRow --> svPreviousRow
svPreviousRo --> Inputlink.svColA
svColAChangeCheck --> if svCurrentRow = svPreviousRow then 0 else 1
svCounter ---> if svColAChangeCheck = 1 then svCounter + 1 else svCounter

And I am trying to assign svCounter to Col B

**Correction - I think the above logic is flawed because as soon as Col A value changes the Counter should reset to 1

So probably
svCounter ---> if svColAChangeCheck = 0 then svCounter + 1 else 1

Am I coding this right ?
Last edited by devsonali on Thu Jun 30, 2016 10:12 am, edited 1 time in total.
UCDI
Premium Member
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

you can do something like a previous & current SV pair. This is rough concept of it..

svars look like
result = calculation of a
counter = if result = prev, counter + 1, 1 (hopefully close on the syntax)
prev = result

...
derive
result | col a
counter | col b

and you might need a special handler for the first row but the idea should work .. it just resets the counter whenever your calculation is not the same as the last time around. The assumption here is that the data is sorted on the input to calculation of a & that the result of the calculation is unique. If not, more info is needed...
devsonali
Premium Member
Premium Member
Posts: 155
Joined: Sun Mar 18, 2007 2:31 pm

Post by devsonali »

Yeah , I think I realized the error (corrected the logic )
However something I am yet to understand is why do we
need to declare at the end
prev = result (according to your code )

instead of declaring it the first variable ?
svPreviousRo --> Inputlink.svColA
UCDI
Premium Member
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

the stage variables look at the current record only. Its not looping here.
So in order to compare the previous to the current, you need to save a copy of the previous record. Then compare it. Then update it to current for the next record, and so on.

or visually, for records 1 and 2 and 3...

result = 1
counter = something
previous = 1

...
result = 2
compare previous (still set to 1!) against result (now 2!)
..
prev = 2
...

result = 3
compare 3 and 2
prev = 3
...

Or that was the idea anyway. Assuming I understood your data example and what you were trying to do.
Post Reply