Set/Get Stage variables?
Moderators: chulett, rschirm, roy
Set/Get Stage variables?
I have searched the forum, and I have been unable to find a DS Server Edition BASIC code snippet for setting and getting stage variables inside a routine / function. I know this must be a simple process, since it is referred to in many posts, but I have not been able to locate it on DSExchange and neither DSSetVar() or DSGetVar() seem to work as advertised in the aging PDF documentation from Ascential. Believe me I would love to have an opportunity to attend IBM training, but so far that is not in the budget.
What I had in mind was a relatively standardized design:
1. create one DS hashed table for each required Oracle dimension table containing only the required business key columns and associated surrogate key.
2. place max surrogate key value per dimension table/file one per table in its own stage variable. Example:
Table: dim_customer would have a dimCustomerMaxKey stage variable (only one job updating at a time, so no worries)
3. process incoming sequential dimension files using DS hashed dimension tables to determine new, updated or obsoleted rows to create a new set of updated dimensions in the DS hashed tables.
For each new dimension dim_customer row, I would be able to increment a stage variable from 1 to represent the offset to be added to dimCustomerMaxKey.
I had considered adding additional columns in the Xform to contain the results of the validation functions, but this would seem to be overkill, and I'm not sure how scalable. My intent is to produce a standard template to serve as a guide for the remaining dimension packages.
Any advice is appreciated.
Best regards.
What I had in mind was a relatively standardized design:
1. create one DS hashed table for each required Oracle dimension table containing only the required business key columns and associated surrogate key.
2. place max surrogate key value per dimension table/file one per table in its own stage variable. Example:
Table: dim_customer would have a dimCustomerMaxKey stage variable (only one job updating at a time, so no worries)
3. process incoming sequential dimension files using DS hashed dimension tables to determine new, updated or obsoleted rows to create a new set of updated dimensions in the DS hashed tables.
For each new dimension dim_customer row, I would be able to increment a stage variable from 1 to represent the offset to be added to dimCustomerMaxKey.
I had considered adding additional columns in the Xform to contain the results of the validation functions, but this would seem to be overkill, and I'm not sure how scalable. My intent is to produce a standard template to serve as a guide for the remaining dimension packages.
Any advice is appreciated.
Best regards.
Re: Set/Get Stage variables?
dmcmunn wrote:I have searched the forum, and I have been unable to find a DS Server Edition BASIC code snippet for setting and getting stage variables inside a routine / function. I know this must be a simple process, since it is referred to in many posts, but I have not been able to locate it on DSExchange...
I'm curious where you are seeing this referenced as there really isn't any such ability. Stage variables are not visible to routines. They can be passed as arguments to routines, of course, but that's about it.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
What I would like to have is an edit function for each column which posts messages and diagnostics to a common memory area. I did attempt to do this via a COMMON dimensioned array...
The Function would be called via Xformer as follows:
Where:
function starts here...
When I attempted to later evaluate the COMMON data structure in a stage variable evaluation, it was empty, so I assume these DS areas don't share a common namespace?
Guess I'll have to pass the return values back as a @VM separated list?
Suggestions appreciated.
Thanks.
The Function would be called via Xformer as follows:
Code: Select all
MyTableEdit(ColIndex, ColName, InputRowNo, ExpectedValues, ActualValue)
Where:
- ColIndex will be the actual ordinal occurrence of the column in the input data;
ColName - actual column name
InputRowNo - @INROWNUM
ExpectedValues - string of delimited list of expected values
ActualValue - DS Input dataset column
function starts here...
Code: Select all
FUNCTION MyTableEdit(ColIndex, ColName, InputRowNo, ExpectedValues, ActualValue)
Common /SALMDLLog/ CommonInitialized, ErrLog(1000,6), ErrCount, MaxErrCount
Equate RoutineName To 'MyTableEdit'
* constants used to indicate classes of failure
Equate FailStage To -1
Equate FailRow To 1
Equate FailColumn To 2
Equate InfoDefaulted To 3
* Original storage statement to dynamic array in COMMON area
* ErrLog(1, 1) = Column/Field Name
* ErrLog(1, 2) = Column Ordinal Value (1 relative)
* ErrLog(1, 3) = Input Row Number
* ErrLog(1, 4) = Actual Input Value
* ErrLog(1, 5) = Reject Severity Code
* ErrLog(1, 6) = Reject Reason Description
IF CommonInitialized = 0 THEN; * is this the first time here?
CommonInitialized = 1 ; * set flag indicating common already initialized
ErrorCount = 0; * reset the error count
MaxErrCount = 1000; * set limit on number of errors to record
END
sActualValue = Trim(ActualValue);
bRejected = @FALSE;
bFailed = @TRUE; * default to failure
iExpectedLength = Len(ExpectedValues)
Ans = "*ERR*"; * default to return error indicator as value
sRejectReason = "";* default to no reject reason
BEGIN CASE
CASE UPCASE(ColName) = "COLUMNNAME1"
* check for errors first, let default handle ok condition
BEGIN CASE
* value is missing/null?
Case IsNull(sActualValue) = @TRUE OR Len(sActualValue) = 0
sRejectReason := "missing/null value;"
RejectSeverity = FailRow
* value too short?
Case Len(sActualValue) < iExpectedLength; * only one expected value so use length
sRejectReason := "value too short (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
RejectSeverity = FailRow
* value too long?
Case Len(sActualValue) > iExpectedLength
sRejectReason := "value too long (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
RejectSeverity = FailRow
* value not expected?
Case sActualValue[1,iExpectedLength] <> ExpectedValues
sRejectReason := "expected (" : sActualValue[1,iExpectedLength] : ") <>" : " [" : ExpectedValues : "];"
RejectSeverity = FailRow
* no? must be ok then
Case @True
Ans = sActualValue;* set return value to plug in outbound column
bFailed = @False
END Case
CASE UPCASE(ColName) = "COLUMNNAME2"
* check for errors first, let default handle ok condition
BEGIN CASE
* value is missing/null?
Case IsNull(sActualValue) = @TRUE OR Len(sActualValue) = 0
sRejectReason := "missing/null value;"
RejectSeverity = FailRow
* value too short?
Case Len(sActualValue) < iExpectedLength; * only one expected value so use length
sRejectReason := "value too short (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
RejectSeverity = FailRow
* value too long?
Case Len(sActualValue) > iExpectedLength
sRejectReason := "value too long (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
RejectSeverity = FailRow
* value not expected?
Case sActualValue[1,iExpectedLength] <> ExpectedValues
sRejectReason := "expected (" : sActualValue[1,iExpectedLength] : ") <>" : " [" : ExpectedValues : "];"
RejectSeverity = FailRow
* no? must be ok then
Case @True
Ans = sActualValue;* set return value to plug in outbound column
bFailed = @False
END Case
CASE UPCASE(ColName) = "COLUMNNAME3"
* check for errors first, let default handle ok condition
BEGIN CASE
* value null?
Case IsNull(sActualValue) = @TRUE OR Len(sActualValue) = 0
sRejectReason := "missing/null value;"
RejectSeverity = FailRow
* value too short?
Case Len(sActualValue) < 1; * only one expected value so use length
sRejectReason := "value too short (<1) Len:(" : Len(sActualValue) : ") [" : sActualValue : "];"
RejectSeverity = FailRow
* value too long?
Case Len(sActualValue) > 9
sRejectReason := "value too long (>9) Len:(" : Len(sActualValue) : ") [" : sActualValue : "];"
RejectSeverity = FailRow
* value not expected?
* Case sActualValue[1,iExpectedLength] <> ExpectedValues
* sRejectReason := "expected (" : sActualValue[1,iExpectedLength] : ") =" : " [" : ExpectedValues : "];"
* RejectSeverity = FailRow
* no? must be ok then
Case @True
Ans = sActualValue
bFailed = @False
END Case
END CASE
* if the column failed, add it to the log unless max errors already reached
IF bFailed = @True THEN
ErrCount += 1
IF ErrCount < MaxErrCount THEN
ErrLog(ErrCount,1) = ColName
ErrLog(ErrCount,2) = ColIndex
ErrLog(ErrCount,3) = InputRowNo
ErrLog(ErrCount,4) = ActualValue
ErrLog(ErrCount,5) = RejectSeverity
ErrLog(ErrCount,6) = sRejectReason
END
END
RETURN(ANS)
When I attempted to later evaluate the COMMON data structure in a stage variable evaluation, it was empty, so I assume these DS areas don't share a common namespace?
Guess I'll have to pass the return values back as a @VM separated list?
Suggestions appreciated.
Thanks.
Couldn't get them to compile regardless of the definitions in JOBCONTROL.H
Tried:
Ran a test and received the following error:
Even though the UI function compiler didn't choke when compiling, the stage failed with an abort in the log. I couldn't find any other reference that would give me a clue about the function prototype, since I'm new to DS and realize we're running on VERY old product (6.x).
Tried:
Code: Select all
$INCLUDE DSINCLUDE JOBCONTROL.H
DEFFUN GetStageVar(varname) Calling "*DataStage*DSGetVar"
sVarVal = GetStageVar("sInRejectMsg")
Ran a test and received the following error:
Test failed.
Program "DSU.MyEdit": Line 19, "*DataStage*DSGetVar" is not in the CATALOG space.
[ENOENT] No such file or directory
Program "DSU.MyEdit": Line 19, Incorrect VOC entry for *DataStage*DSGetVar.
Program "DSU.MyEdit": Line 19, Unable to load subroutine.
Even though the UI function compiler didn't choke when compiling, the stage failed with an abort in the log. I couldn't find any other reference that would give me a clue about the function prototype, since I'm new to DS and realize we're running on VERY old product (6.x).
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Stage variables' scope is only within the Transformer stage (and therefore in its process). Similarly, any variables defined as COMMON in the Transformer stage are only accessible in its process; that is, in a before-stage or after-stage subroutine invoked from that Transformer stage itself, or in the row-by-row processing.
Are you executing in an after-stage subroutine?
You will need to dump the values somewhere persistent before exiting if you want to examine them subsequently.
Are you executing in an after-stage subroutine?
You will need to dump the values somewhere persistent before exiting if you want to examine them subsequently.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Ray,
I'd really prefer not to have to manage a potentially large dynamic array until end of stage, preferring instead to collect any/all errors on a row-by-row basis and pump each one individually out in a rejected row table for later consumption.
BTW, Is there a FAQ somewhere on COMMON processing I can consume?
I guess I flubbed the COMMON processing...I had a COMMON data structure defined in each column's validation function (on OP) and was setting data within a dynamic hashed array in named COMMON area /MYCOMMON/ :
Then in the Stage variable section I generated two more derivative functions to iterate through the /MYCOMMON/ hashed array (ErrLog<...>) for each column building a single reject reason from each column's errors for the row and another to set the number of errors received on the row; IOW: two new OP columns (1) RejectReason and (2) ErrCount to be sent to the reject table in Oracle.
The idea was to have the stage variable-based derivative functions access the COMMON data structures and populate the outbound reject row, without having to repeat all of the tests in CONSTRAINTS.
I know I'm missing something obvious, and making something simple harder than it should be. I feel like I'm trying to drive a nail with a screwdriver! Good thing there's no sharp edges on DS. :D
I appreciate your time.
Best regards.
I'd really prefer not to have to manage a potentially large dynamic array until end of stage, preferring instead to collect any/all errors on a row-by-row basis and pump each one individually out in a rejected row table for later consumption.
BTW, Is there a FAQ somewhere on COMMON processing I can consume?
I guess I flubbed the COMMON processing...I had a COMMON data structure defined in each column's validation function (on OP) and was setting data within a dynamic hashed array in named COMMON area /MYCOMMON/ :
Code: Select all
ErrLog<1,1> = inputrecno;
ErrLog<1,2> = 'columnname';
ErrLog<1,3> = 'jobstarttimestamp';
ErrLog<1,4> = 'datasourcename';
ErrLog<1,5> = numericSeverityCode;
ErrLog<1,6> = 'rejectReason';
ErrLog<1,6> = 'actualColumnValue';
...
Then in the Stage variable section I generated two more derivative functions to iterate through the /MYCOMMON/ hashed array (ErrLog<...>) for each column building a single reject reason from each column's errors for the row and another to set the number of errors received on the row; IOW: two new OP columns (1) RejectReason and (2) ErrCount to be sent to the reject table in Oracle.
The idea was to have the stage variable-based derivative functions access the COMMON data structures and populate the outbound reject row, without having to repeat all of the tests in CONSTRAINTS.
I know I'm missing something obvious, and making something simple harder than it should be. I feel like I'm trying to drive a nail with a screwdriver! Good thing there's no sharp edges on DS. :D
I appreciate your time.
Best regards.
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
There's no problem using dynamic arrays in COMMON areas.
While there's no COMMON FAQ, there is good coverage in the DataStage BASIC manual. The important thing is that, if the same COMMON area is used in more than one routine, it must be identically defined (in terms of variable types rather than names) everywhere it is use. To this end some folks like to set up INCLUDE files in DSU_BP containing the COMMON declarations.
Your approach is completely valid. Sounds like you just need to get your head around the techniques.
Note that DataStage BASIC statements do not require ";" line terminators. This character separates statements on the same line, so you are generating a no-op statement at the end of each actual statement.
I note in the routine you posted that you are overwriting Ans in each Case. Is that what you intended? And you're still referring to the dimensioned arrays within the code; do you want to use dimensioned arrays or dynamic arrays (no upper limit on entries) or some combination thereof?
All that said, you don't really need COMMON at all, if it's row-by-row processing. Any stage variable will have a sufficient degree of persistence, and can hold a dynamic array of any size. Initialize the stage variable to "" (a dynamic array containing no fields) and off you go.
While there's no COMMON FAQ, there is good coverage in the DataStage BASIC manual. The important thing is that, if the same COMMON area is used in more than one routine, it must be identically defined (in terms of variable types rather than names) everywhere it is use. To this end some folks like to set up INCLUDE files in DSU_BP containing the COMMON declarations.
Your approach is completely valid. Sounds like you just need to get your head around the techniques.
Note that DataStage BASIC statements do not require ";" line terminators. This character separates statements on the same line, so you are generating a no-op statement at the end of each actual statement.
I note in the routine you posted that you are overwriting Ans in each Case. Is that what you intended? And you're still referring to the dimensioned arrays within the code; do you want to use dimensioned arrays or dynamic arrays (no upper limit on entries) or some combination thereof?
All that said, you don't really need COMMON at all, if it's row-by-row processing. Any stage variable will have a sufficient degree of persistence, and can hold a dynamic array of any size. Initialize the stage variable to "" (a dynamic array containing no fields) and off you go.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Ray,
I'm sure you are right.
What I have is a pretty simple test job stage...
Input Sequential File ---> XFormer ---> PassedRows
|
+--------->RejectedRows
However, you touched on the very subject that has escaped me...how do I get/set Stage Variables from column-level validation functions? Is there some other, more preferred mechanism for handling this or have I just concocted a rube-goldberg approach?
I can see using ValidateField()-type function on each Output field, but if I want to capture metadata about the reject reason in a Stage Variable dynamic array or a dimensioned table and refer to this collection in the rejected row derivation, HOW is this accomplished? I have been through the BASIC manual several times and through the SDK\Utility routines, but I found NOTHING about this concept at all. Again, I know there is some fundamental concept that is so simple to you DS veterans that you don't even think about it. Kind of like, 'OHHH breathing, you want me to explain BREATHING?'
As for data typing in a named COMMON area, I have seen nothing to indicate data types in 6.x BASIC. Pretty loosely typed. Does it follow some sort of character suffix convention for data typing like in the BASIC of yore?
Thanks again for your help.
Best regards.
I'm sure you are right.
What I have is a pretty simple test job stage...
Input Sequential File ---> XFormer ---> PassedRows
|
+--------->RejectedRows
However, you touched on the very subject that has escaped me...how do I get/set Stage Variables from column-level validation functions? Is there some other, more preferred mechanism for handling this or have I just concocted a rube-goldberg approach?
I can see using ValidateField()-type function on each Output field, but if I want to capture metadata about the reject reason in a Stage Variable dynamic array or a dimensioned table and refer to this collection in the rejected row derivation, HOW is this accomplished? I have been through the BASIC manual several times and through the SDK\Utility routines, but I found NOTHING about this concept at all. Again, I know there is some fundamental concept that is so simple to you DS veterans that you don't even think about it. Kind of like, 'OHHH breathing, you want me to explain BREATHING?'
As for data typing in a named COMMON area, I have seen nothing to indicate data types in 6.x BASIC. Pretty loosely typed. Does it follow some sort of character suffix convention for data typing like in the BASIC of yore?
Thanks again for your help.
Best regards.
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
First, there are NO data types in DataStage BASIC. That makes life easier or more difficult, depending upon your mindset.
In my project, DSGetVar is not cataloged in VOC the same as yours is, and yours is - I believe - incorrect. First, check whether files called DSGetVar.B and DSSetVar.B exist in a directory called DSG_BP.O in your project (on the server).
If they are there, try these commands from TCL:
This should make copies of your current entries, then create correct VOC catalog entries.
Let us know how it went.
In my project, DSGetVar is not cataloged in VOC the same as yours is, and yours is - I believe - incorrect. First, check whether files called DSGetVar.B and DSSetVar.B exist in a directory called DSG_BP.O in your project (on the server).
If they are there, try these commands from TCL:
Code: Select all
COPY FROM VOC DSGetVar,DSGetVar.BAK
COPY FROM VOC DSSetVar,DSSetVar.BAK
CATALOG DSG_BP DSGetVar.B DSGetVar LOCAL
CATALOG DSG_BP DSSetVar.B DSSetVar LOCALThis should make copies of your current entries, then create correct VOC catalog entries.
Let us know how it went.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
While attempting the first command:
I received:
I assume this means DSGetVar is already defined in the VOC. I will muddle along in the UniBasic manual and see what I can find out.
Thanks again for your help.
Best regards.
Code: Select all
COPY FROM VOC DSGetVar, DSGetVar.BAK
I received:
Record "DSGetVar" already exists in file "VOC"; not COPIED.
Record "DSGetVar.BAK" not found in file "VOC".
0 records copied.
I assume this means DSGetVar is already defined in the VOC. I will muddle along in the UniBasic manual and see what I can find out.
Thanks again for your help.
Best regards.
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Nice. I do love a tool with parameter whitespace sensitivity...reminds me of the bad ole RPG days of the mid-80's.
I was able to get DSGetVar() working today, but DSSetVar() has not proven as maleable, and my hammer is getting heavy. Stone-age Perl is looking better all the time.
The template is almost done now, if I can just get this last inter-locking stone in place...
I do thank you for the continuous support.
Best regards,
-Don
I was able to get DSGetVar() working today, but DSSetVar() has not proven as maleable, and my hammer is getting heavy. Stone-age Perl is looking better all the time.
The template is almost done now, if I can just get this last inter-locking stone in place...
I do thank you for the continuous support.
Best regards,
-Don
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Ray et al,
Ok, full disclosure...I have this working now. Thanks for all your help.
At the risk of seeming uber-geeky...
I believe this approach to be overly complex and will continue to refactor it until there is nothing left to throw away. Kind of like the processes used in poetry, sculpting or painting...so I'm told by my more creative acquaintences. Keep at it until only the purest form remains.
General Overview
1. There are a pair (1 function and 1 transform) used to edit each column.
2. There are another pair (1 function and 1 transform) for each reject row.
3. There are three stage variables to track state of column and row validations.
4. Each column/field to be validated in the output is passed through the same function.
5. Errors are accumulated in a dynamic, hashed array (ErrLog) defined in named COMMON
6. The dynamic array (ErrLog) is held in named COMMON and accessed through a transformation function in the Reject link to compile a consolidated list of column errors to be associated with the row rejected.
Stage Variables:
RejectInputRow has the following stage variable derivation defined:
RowErrCount > 0
Both InRejectMsg and RowErrCount are stage variables managed by the functions auditing each column/row. This management is accomplished via DSGetVar() and DSSetVar().
Example of usage in Output sink:
ValidateTblxxx(Input.CODE_TYPE, "code_type", @INROWNUM, "ExpectedValueHere", 1)
ValidateTblxxx(Input.FACILITY_CODE, "facility_code", @INROWNUM, "RR", 2)
ValidateTblxxx(Input.SALES_CODE, "SALES_CODE", @INROWNUM, "",3)
...
Output sink has the following Constraint defined:
RejectInputRow = @FALSE
Example of usage in Reject link:
Reject sink Column Name: Message
Derivation:
"ColErr: " : TblxxxRowCK(@INROWNUM) : " |DSLastLinkErr: " : DSLinkLastErr : "|DSStageLastErr: " : DSStageLastErr
Function/Transform used to validate each column:
Next, is the routine to fire in the Reject Link...
Additional metadata captured in the Reject link include Input row number, event datetimestamp, and the message compiled in the immediate routine above, along with DSHostname, DSProjectName, DSStageName, DSLinkName, RoutineName. I will probably add the input source name as a job parameter in the near future.
Best regards,
--Don
Ok, full disclosure...I have this working now. Thanks for all your help.
At the risk of seeming uber-geeky...
I believe this approach to be overly complex and will continue to refactor it until there is nothing left to throw away. Kind of like the processes used in poetry, sculpting or painting...so I'm told by my more creative acquaintences. Keep at it until only the purest form remains.
General Overview
1. There are a pair (1 function and 1 transform) used to edit each column.
2. There are another pair (1 function and 1 transform) for each reject row.
3. There are three stage variables to track state of column and row validations.
4. Each column/field to be validated in the output is passed through the same function.
5. Errors are accumulated in a dynamic, hashed array (ErrLog) defined in named COMMON
6. The dynamic array (ErrLog) is held in named COMMON and accessed through a transformation function in the Reject link to compile a consolidated list of column errors to be associated with the row rejected.
Stage Variables:
Code: Select all
Name Initial Value Description
InRejectMsg "" Input record reject message
RowErrCount 0 Number of errors encountered while editing row
RejectInputRow 0 Boolean True/False for rejected record
RejectInputRow has the following stage variable derivation defined:
RowErrCount > 0
Both InRejectMsg and RowErrCount are stage variables managed by the functions auditing each column/row. This management is accomplished via DSGetVar() and DSSetVar().
Example of usage in Output sink:
ValidateTblxxx(Input.CODE_TYPE, "code_type", @INROWNUM, "ExpectedValueHere", 1)
ValidateTblxxx(Input.FACILITY_CODE, "facility_code", @INROWNUM, "RR", 2)
ValidateTblxxx(Input.SALES_CODE, "SALES_CODE", @INROWNUM, "",3)
...
Output sink has the following Constraint defined:
RejectInputRow = @FALSE
Example of usage in Reject link:
Reject sink Column Name: Message
Derivation:
"ColErr: " : TblxxxRowCK(@INROWNUM) : " |DSLastLinkErr: " : DSLinkLastErr : "|DSStageLastErr: " : DSStageLastErr
Function/Transform used to validate each column:
Code: Select all
* Function Name: TblxxxColumnValidationEdit
* Interface: TblxxxColumnValidationEdit(ColIndex, ColName, InputRowNo, ExpectedValues, ActualValue)
* Referenced later by a custom Transformation: ValidateTblxxx()
* ValidateTblxxx(%column_ordinal%, %col_name%, %input_row_num%, %expected_value%, %actual_value%)
*
* This routine will validate the xxxtable columns
* If an invalid value is found, the input reject message stage variable
* will contain a descriptive reason
*
Common /TblxxxLog/ CommonInitialized, ErrLog, ErrCount, MaxErrCount, RowErrCount, PrevInputRowNumber
$INCLUDE DSINCLUDE JOBCONTROL.H
Equate RoutineName To 'TblXXXColumnValidationEdit'
* ------------------------------------------------------------------
* Define external function links here
* ------------------------------------------------------------------
DEFFUN GetStageVar(varname) Calling "DSGetVar"
DEFFUN SetStageVar(varname, varvalue) Calling "DSSetVar"
DEFFUN UtilityMessageToLog(msg) Calling "DSX.UtilityMessageToLog"
* ------------------------------------------------------------------
* constants used to indicate column action state
* ------------------------------------------------------------------
Equate PASSED To 0
Equate FAILSTAGE To -1
Equate FAILROW To 1
Equate FAILCOLUMN To 2
Equate DEFAULTED To 3
* ------------------------------------------------------------------
* Define useful stage variables here
* ------------------------------------------------------------------
Equate SV_ROWREJECTMSG To "InRejectMsg"
Equate SV_ROWERRORCOUNT To "RowErrCount"
* ------------------------------------------------------------------
* Obtain local copies of stage variable values
* ------------------------------------------------------------------
svRejectMsg = GetStageVar(SV_ROWREJECTMSG)
svRowErrCount = GetStageVar(SV_ROWERRORCOUNT)
* ------------------------------------------------------------------
* build location information
* ------------------------------------------------------------------
sMsgLocation = DSHostName : "." : DSProjectName : "." : DSJobName : "." : DSStageName : "." : DSLinkName : "." RoutineName
*<== remove to debug ==>* status = UtilityMessageToLog(sMsg)
* ------------------------------------------------------------------
* determine if the named COMMON data area requires initialization
* ------------------------------------------------------------------
IF CommonInitialized = 0 THEN; * is this the first time here?
CommonInitialized = 1; * set flag indicating common already initialized
ErrorCount = 0; * reset the error count
MaxErrCount = 1000; * set limit on number of errors to record
ErrLog=""; * initialize the dynamic, hashed array in COMMON
PrevInputRowNumber = -1
END
* ------------------------------------------------------------------
* Make local copies of args
* ------------------------------------------------------------------
iColIndex = ColIndex
sColName = UPCASE(Trim(ColName))
iInputRowNo = InputRowNo
sExpectedValues = ExpectedValues
sActualValue = Trim(ActualValue)
bPriorError = Len(svRejectMsg) > 0; * errors encountered for this row on prior column?
If iInputRowNo <> PrevInputRowNumber THEN
PrevInputRowNumber = iInputRowNo; * initialize current row number
s = SetStageVar(SV_ROWREJECTMSG, ""); * clear the row error message
s = SetStageVar(SV_ROWERRORCOUNT, 0); * reset row-specific error count
RowErrCount = 0; * reset COMMON column error count
END
* ------------------------------------------------------------------
* determine the number of potential values
* if there is only one expected value, assume the expected value
* length to be the same as that of the expected value
* ------------------------------------------------------------------
iExpectedValues = DCount(sExpectedValues,@VM)
IF iExpectedValues = 1 THEN
iExpectedLength = Len(sExpectedValues)
End
* -----------------------------------------------
* if an expected value list was provided,
* Calculate min/max expected value lengths
* -----------------------------------------------
IF iExpectedValues > 0 THEN
iMaxExpectedValueLength = -1; * set impossible maximum expected value length
iMinExpectedValueLength = 99999; * set improbable minimum expected value length
FOR i=1 TO iExpectedValues
iExpectedValueLength = Len(Field(sExpectedValues,@VM,i,1))
IF iExpectedValueLength > iMaxExpectedValueLength THEN
iMaxExpectedValueLength = iExpectedValueLength
END
IF iExpectedValueLength < iMinExpectedValueLength THEN
iMinExpectedValueLength = iExpectedValueLength
END
NEXT
* determine if the column value is in the list of expected values;
* default to found (@true); @false if not
bExpectedValueFound = @TRUE
FIND sActualValue in sExpectedValues Setting F,F,F ELSE bExpectedValueFound = @FALSE END
ELSE
* if no expected values were specified, default to a broad range
iMaxExpectedValueLength = 99999;
iMinExpectedValueLength = 0;
bExpectedValueFound = @TRUE
END; * IF iExpectedValues > 0 THEN
sColRejectReason = ""; * default to no reject reason
iColRejectSeverity = PASSED; * default to column pass
* ----------------------------------------------------------
* column-specific testing begins here...
* ----------------------------------------------------------
BEGIN CASE
CASE sColName = "CODE_TYPE"
* check for errors first, let default handle ok condition
BEGIN CASE
* value is missing/null?
Case IsNull(sActualValue) OR Len(sActualValue) = 0
ssColRejectReason := "missing/null value;"
RejectSeverity = FAILROW
* value too short?
Case Len(sActualValue) < iMinExpectedValueLength
sColRejectReason := "value too short (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
iColRejectSeverity = FAILROW
* value too long?
Case Len(sActualValue) > iMaxExpectedValueLength
sColRejectReason := "value too long (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
iColRejectSeverity = FAILROW
* value not expected?
Case bExpectedValueFound = @FALSE
sColRejectReason := "expected (" : sActualValue : ") <>" : " [" : sExpectedValues : "];"
iColRejectSeverity = FAILROW
* must be ok then
Case @True
Ans = sActualValue; * set return value to plug in outbound column
iColRejectSeverity = PASSED;
END Case
CASE sColName = "FACILITY_CODE"
* check for errors first, let default handle ok condition
BEGIN CASE
* value is missing/null?
Case IsNull(sActualValue) OR Len(sActualValue) = 0
sColRejectReason := "missing/null value;"
iColRejectSeverity = FAILROW
* value too short?
Case Len(sActualValue) < iMinExpectedValueLength
sColRejectReason := "value too short (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
iColRejectSeverity = FAILROW
* value too long?
Case Len(sActualValue) > iMaxExpectedValueLength
sColRejectReason := "value too long (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
iColRejectSeverity = FAILROW
* value not expected?
Case bExpectedValueFound = @FALSE
sColRejectReason := "expected (" : sActualValue : ") <>" : " [" : sExpectedValues : "];"
iColRejectSeverity = FAILROW
* must be ok then
Case @True
Ans = sActualValue; * set return value to plug in outbound column
iColRejectSeverity = PASSED
END Case
CASE sColName = "SALES_CODE"
* check for errors first, let default handle ok condition
BEGIN CASE
* value null?
Case IsNull(sActualValue) = @TRUE OR Len(sActualValue) = 0
sColRejectReason := "missing/null value;"
iColRejectSeverity = FAILROW
* value too short?
Case Len(sActualValue) < 1
sColRejectReason := "value too short (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
iColRejectSeverity = FAILROW
* value too long?
Case Len(sActualValue) > 9
sColRejectReason := "value too long (" : Len(sActualValue) : ")" : " [" : sActualValue : "];"
iColRejectSeverity = FAILROW
* value not expected?
Case bExpectedValueFound = @FALSE
sColRejectReason := "expected (" : sActualValue : ") <>" : " [" : sExpectedValues : "];"
iColRejectSeverity = FAILROW
* no? must be ok then
Case @True
Ans = sActualValue; * set return value to plug in outbound column
iColRejectSeverity = PASSED
END Case
END CASE
* --------------------------------------------------------------------------
* if the column fails the row:
* increment the number of errors encountered
* add it to the error log dynamic array, unless max errors already reached
* --------------------------------------------------------------------------
IF iColRejectSeverity = FAILROW THEN
ErrCount += 1; * increment total errors (all rows)
IF ErrCount < MaxErrCount THEN
RowErrCount += 1; * increment # errors for this row
s = SetStageVar(SV_ROWERRORCOUNT, RowErrCount); * re-save the row error count
sMsg = sMsgLocation : "." : sColName : ".(" : iColRejectSeverity : ")." : sColRejectReason : ".RowErrCount=(" : RowErrCount : ").AllErrCount=(" : ErrCount : ")"
s = UtilityMessageToLog(sMsg)
END
* --------------------------------------------------------------------------
* save column error metadata in hashed dynamic array in COMMON for later
* compilation by the reject output link
* --------------------------------------------------------------------------
ErrLog<ErrCount,1> = sColName
ErrLog<ErrCount,2> = iColIndex
ErrLog<ErrCount,3> = iInputRowNo
ErrLog<ErrCount,4> = sActualValue
ErrLog<ErrCount,5> = iColRejectSeverity
ErrLog<ErrCount,6> = sColRejectReason
END
Return (Ans)
Next, is the routine to fire in the Reject Link...
Code: Select all
* Interface: FUNCTION TblxxxRowCK(InputRowNo)
Common /TblxxxLog/ CommonInitialized, ErrLog, ErrCount, MaxErrCount, RowErrCount, PrevInputRowNumber
* Transformation Interface: TblxxxReject(%InputRowNumber%)
$INCLUDE DSINCLUDE JOBCONTROL.H
Equate RoutineName To 'TblxxxRowCK'
* ------------------------------------------------------------------
* Define external function links here
* ------------------------------------------------------------------
DEFFUN GetStageVar(varname) Calling "DSGetVar"
DEFFUN SetStageVar(varname, varvalue) Calling "DSSetVar"
DEFFUN UtilityMessageToLog(msg) Calling "DSX.UtilityMessageToLog"
* ------------------------------------------------------------------
* Define useful stage variables here
* ------------------------------------------------------------------
Equate SV_ROWREJECTMSG To "InRejectMsg"
Equate SV_ROWERRORCOUNT To "RowErrCount"
* ------------------------------------------------------------------
* Obtain local copies of stage variable values
* ------------------------------------------------------------------
svRejectMsg = GetStageVar(SV_ROWREJECTMSG)
svRowErrCount = GetStageVar(SV_ROWERRORCOUNT)
* ------------------------------------------------------------------
* constants used to indicate column action state
* ------------------------------------------------------------------
Equate PASSED To 0
Equate FAILSTAGE To -1
Equate FAILROW To 1
Equate FAILCOLUMN To 2
Equate DEFAULTED To 3
* ------------------------------------------------------------------
* Format of dynamic array in COMMON area...
* ------------------------------------------------------------------
* Original storage statement to dynamic array in COMMON area
* ErrLog<1, 1> = Column/Field Name
* ErrLog<1, 2> = Column Ordinal Value (1 relative)
* ErrLog<1, 3> = Input Row Number
* ErrLog<1, 4> = Actual Input Value
* ErrLog<1, 5> = Reject Severity Code
* ErrLog<1, 6> = Reject Reason Description
* Values in each column entry in the ErrLog<> dynamic array
Equate ICOLNAME To 1
Equate ICOLINDEX To 2
Equate IROWNUM To 3
Equate ICOLVALUE To 4
Equate ISEVCODE To 5
Equate IREASON To 6
* ------------------------------------------------------------------
* build location information
* ------------------------------------------------------------------
sMsgLocation = DSHostName : "." : DSProjectName : "." : DSJobName : "." : DSStageName : "." : DSLinkName : "." RoutineName
* status = UtilityMessageToLog(sMsg)
* ------------------------------------------------------------------
* determine if the named COMMON data area is initialized, bail if not
* ------------------------------------------------------------------
IF CommonInitialized = 0 THEN ANS=""
else
* ------------------------------------------------------------------
* Make local copies of args
* ------------------------------------------------------------------
iInputRowNo = InputRowNo
iErrorCount = ErrCount
ErrorCode = 0 ;* default to pass; set non-zero to stop the stage/job
sRejectMsg = ""; * initialize the reject msg
IF svRowErrCount > 0 THEN
FOR i = 1 TO iErrorCount
IF ErrLog<i,IROWNUM> = iInputRowNo THEN
iRejectSeverity = ErrLog<i,ISEVCODE>
sRejectReason = ErrLog<i,IREASON>
sRejectMsg := "Sev:" : iRejectSeverity : " |Col: " : ErrLog<i,ICOLNAME> : " |Msg: " : sRejectReason
END
NEXT
END
* store the composite error message back to the stage variable
s = SetStageVar(SV_ROWREJECTMSG, sRejectMsg)
Ans = sRejectMsg
END; * IF CommonInitialized = 0 THEN...
Additional metadata captured in the Reject link include Input row number, event datetimestamp, and the message compiled in the immediate routine above, along with DSHostname, DSProjectName, DSStageName, DSLinkName, RoutineName. I will probably add the input source name as a job parameter in the near future.
Best regards,
--Don
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I still don't believe you need the named COMMON. I believe you could use a stage variable called svErrLog in exactly the same way. The named COMMON is not wrong, but does preclude you using this job or a fragment of it in a server shared container in a parallel job, if that's a consideration.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

