I would very much like to calculate standard deviation inside a DataStage job and/or a DataStage function. Has anyone here either written such a function or have any recommendations on how to achieve this?
I know I can do this in Excel but I would like to keep this contained inside a DataStage job.
Standard Deviation in DataStage?
Moderators: chulett, rschirm, roy
HI Arnd,
Not sure if you ready to use intermediat stage/ files/Jobs, I gess it could be done through Datastage, if you are ready to pass the total count and average of the value that you need to find the SD as parameter.
If you able to find the following for each record
and sum it using Aggregator, and if you find I guess you can get the value.
Where pAvgx and pTotalCount are parameters.
Not sure if you ready to use intermediat stage/ files/Jobs, I gess it could be done through Datastage, if you are ready to pass the total count and average of the value that you need to find the SD as parameter.
If you able to find the following for each record
Code: Select all
(x - pAvgx)*(x - pAvgx)and sum it using Aggregator, and if you find
Code: Select all
SQRT(sum value/pTotalCount)Where pAvgx and pTotalCount are parameters.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
I looked at the formula on Wikipedia and it wasn't too bad, so I wrote:
Code: Select all
*************************************************************************
** accept a delimited string of values to and a standard deviation. **
** is started. **
** **
** Ver. Date Author Comments **
** ===== ========== ======== ========================================= **
** 1.0.0 2006/10/10 AWussing Initial coding and testing **
** **
*************************************************************************
Ans = '' ;** assume an invalid return
WorkList = CONVERT(' ,;:':@VM:@TM:@IM,@FM:@FM:@FM:@FM:@FM:@FM,DynamicList)
ListCount = DCOUNT(WorkList,@FM)
IF ListCount > 1
THEN
Median = SUM(WorkList)/ListCount
Summation = 0
FOR x = 1 TO ListCount
WorkNumber = WorkList<x>
Summation += (WorkNumber-Median)**2
NEXT x
Ans = (Summation / ListCount) ** .5
END ;** of if-then we have a valid list-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
</a>