Calculating top vaues
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 497
- Joined: Sun Dec 17, 2006 11:52 pm
- Location: Kolkata
- Contact:
Calculating top vaues
Hi all,
How can we extract out top 10 % value of any field in DataStage ?
thanx in advance .
Pravin
How can we extract out top 10 % value of any field in DataStage ?
thanx in advance .
Pravin
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
DataStage is not a reporting tool.
You do this in SQL. Hopefully in some variant of SQL that gives you some display functions such as RANK in its query language, and some means to restrict a query based on rank.
You do this in SQL. Hopefully in some variant of SQL that gives you some display functions such as RANK in its query language, and some means to restrict a query based on rank.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 497
- Joined: Sun Dec 17, 2006 11:52 pm
- Location: Kolkata
- Contact:
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Of course you CAN use SQL - via an ODBC driver for text file. Just don't expect it to be fast. But for sufficiently small text files performance may be acceptable.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 497
- Joined: Sun Dec 17, 2006 11:52 pm
- Location: Kolkata
- Contact:
-
- Premium Member
- Posts: 497
- Joined: Sun Dec 17, 2006 11:52 pm
- Location: Kolkata
- Contact:
You need to sort these values ascending, line them up in a single row, get the max value (last value), calculate the 10 percent breaking point, filter all the records equal to or less than that breaking point.pravin1581 wrote: It is top 10% of a particular column in a file.
If we are talking about Top 10% on values then you have to follow the method I gave above. If its Top 10% positional values, then all you need to do is get the number of records, sort again and get the 10% of the number of records. Get all records before that breaking point.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Get the record count before the job runs. For UNIX or MKS 'wc -l' was suggested earlier so I didn't repost that. Pass that number in as a job parameter. After you sort, use a constraint where @OUTROWNUM <= 10% of the input rows.pravin1581 wrote:I am facing problems in record count also as DSLINKINFo always returns 0 . How can we restrict the data to top 10 % .chulett wrote:In Server get the record count and Sort the data descending on the value in question. Constrain your output to the 'top 10%' of the records. ...
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers