Find the highest value in a group
Moderators: chulett, rschirm, roy
Find the highest value in a group
Hello all,
I have a set of records that I am sorting and creating a change key. I want to select the record that has the highest value in a different column (Col_2 below). See example below. Is there a way to select that record in the group without having to resort? Thanks in advance for any help.
Input after the sort and key change:
keyChange, Col_1, Col_2, Col_3
=-=-=-=--=-=-=-
1, AAAA, 1, Data
0, AAAA, 3, Data
0, AAAA, 2, Data
1, BBBB, 1, Data
0, BBBB, 2, Data
Desired output:
Col_1, Col_2
=-=-=-=--=-=-=-
AAAA, 3, Data
BBBB, 2, Data
I have a set of records that I am sorting and creating a change key. I want to select the record that has the highest value in a different column (Col_2 below). See example below. Is there a way to select that record in the group without having to resort? Thanks in advance for any help.
Input after the sort and key change:
keyChange, Col_1, Col_2, Col_3
=-=-=-=--=-=-=-
1, AAAA, 1, Data
0, AAAA, 3, Data
0, AAAA, 2, Data
1, BBBB, 1, Data
0, BBBB, 2, Data
Desired output:
Col_1, Col_2
=-=-=-=--=-=-=-
AAAA, 3, Data
BBBB, 2, Data
thanks chulett
Empl_rec, Daily_Change_Ver_Num, Changed Data
03456, 1, Empl_Status
03456, 3, Empl_Status, Leave_of_Absone_Date, New_Salary
03456, 2, Empl_Status , Leave_of_Absence_Date
45633, 1, New_Address
45633, 2, New_Address, Empl_Status
There may be multiple changes to a record between runes. Daily_Change_Ver_Num represents which one happed first, second, etc. The last version has all the changes from the previous versions.
Empl_rec, Daily_Change_Ver_Num, Changed Data
03456, 1, Empl_Status
03456, 3, Empl_Status, Leave_of_Absone_Date, New_Salary
03456, 2, Empl_Status , Leave_of_Absence_Date
45633, 1, New_Address
45633, 2, New_Address, Empl_Status
There may be multiple changes to a record between runes. Daily_Change_Ver_Num represents which one happed first, second, etc. The last version has all the changes from the previous versions.
So your desired output would be...
Empl_rec, Daily_Change_Ver_Num, Changed Data
03456, 3, "Empl_Status, Leave_of_Absone_Date, New_Salary"
45633, 2, "New_Address, Empl_Status"
Correct? And your changed data field, is it in fact one field containing a comma delimited list of the changes made? I specifically put quotes around the values assuming that was the case. If so, you should be able to have the Aggregator take the max() value from that column as well.
Empl_rec, Daily_Change_Ver_Num, Changed Data
03456, 3, "Empl_Status, Leave_of_Absone_Date, New_Salary"
45633, 2, "New_Address, Empl_Status"
Correct? And your changed data field, is it in fact one field containing a comma delimited list of the changes made? I specifically put quotes around the values assuming that was the case. If so, you should be able to have the Aggregator take the max() value from that column as well.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
As Craig noted, you don't need to re-sort (unless you want to run the Aggregator stage in Sort mode). What IS important is that the data are partitioned using a key-based algorithm on the grouping column(s).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

