Compare previous rows and with another column

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
sam334
Premium Member
Premium Member
Posts: 145
Joined: Mon Aug 26, 2013 7:42 pm

Compare previous rows and with another column

Post by sam334 »

All,
Have a question on an issue I am encountering now. Have two columns , ID and Status as below.

ID Status
1 | A
1 | A
1 | R
2 | R
3 | A
3 | R
3 | R
3 | R

Now, I want to compare with previous row and the status column. So, for ID 1, as there is one R(Rejected) but another 2 are Accepted , ID 1 is Accepted. For ID 2 which is only one row, its already R rejected. For ID 3 as there is one Accepted but other 3 are Rejected, its still rejected.

Any lead how to achieve this. Basically for each ID if there is an A then its accepted.

Thanks.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Probably easiest to use stage variables (a) to determine that you're in a new group, and (b) to determine whether status A has been seen. Use the LastRowInGroup() function to restrict output to those rows.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sam334
Premium Member
Premium Member
Posts: 145
Joined: Mon Aug 26, 2013 7:42 pm

Post by sam334 »

Ray, thanks for your thoughts. So, SV(b) is Status='A', but what do you mean by SV(a) ' In a new group'. Also, would LastRowInGroup() function be in one of the stage variables?

Thanks again.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

'In a new group', meaning has the value of ID changed. You can do it all manually via properly sequenced stage variables or you can leverage the Sort stage (even telling it to not sort) with the Key Change column property enabled. It will add a new column to the stream that when true means the first value in a new group and false when you are still in the current group, greatly simplifying your check. For example:

ID Status Key Change
1 | A | 1
1 | A | 0
1 | R | 0
2 | R | 1
3 | A | 1
3 | R | 0
3 | R | 0
3 | R | 0

Then track if an 'A' is encountered in the group. Need to clean up your rules a bit as you say "if there is an A then it is accepted" but you also declare 3 as "still rejected". Regardless, all that should get you closer to your goal. And yes, the function would be leveraged in a stage variable to help drive the constraint so you only get one output record per group... assuming that is what you need.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Majority rules? Whatever the rule is, you need to document it!

You may need to count the A values and the R values in each group to make your final decision.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply