calling a parallel job many times using parameters

Nat_1
Participant
Posts: 4
Joined: Wed Jan 06, 2010 12:11 pm

Post by Nat_1 »

Hi all,

I'm new to this environment and still trying to figure out how to do things. Here is what I'm doing:

I've got 1000 files that need to be processed by my job, and I need to pass the filename and several other parameters (that are changed for every file) to it. The job outputs a file for each input file, so I'm left with 1000 output files. The job works fine.

The problem I'm having is this: I'd like to start it once and have the job run once per file. How do I call it from within a loop where I can pass in variables to be used as the parameters? I've already tried calling my job within a Batch job in DataStage Director, but I can't seem to add any code to loop through my variables (I tried adding an array and some variables and putting in a loop, but it won't compile). I know I can enter the parameters 1000 times and it'll generate the code for me, but that's just not practical. I can't find any examples of adding your own code to the pre-generated code in the batch job, so I'm thinking maybe you just can't add your own code to it.

My second idea was this: I've thought of using a server job with a Start Loop stage, etc. to call my parallel job, but I have seen no examples of how to call a job from a loop and pass parameters from the loop. I've tried to muddle my way through it, but I'm not getting anywhere. Any thoughts?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If you want some help with your batch job, you'd need to post the code, or at least enough of it to encompass the "won't compile" part, and people here could get you going there.

However, a Sequence job (not a Server job) with the Start/End Loop and UserVariables stage to run your Parallel job should be just as viable. How are you handling these multiple parameter values? How are they associated with the files that are being processed? I'm mostly wondering how, inside the loop, you'll know which set of parameter values to use for any given file. The documentation for building Sequence jobs should include examples of all this, at least it certainly used to.
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by chulett »

This might help and could possibly springboard you to search for other similar topics as well.

viewtopic.php?t=127425

Post by Nat_1 »

Here is the code I put in the batch job (I only used 2 filenames in my array, not all 1000):

String[] files = {"S10001ECL.txt", "S10005ECL.txt"};
String fltr = "";

for (String s : files ) {

fltr = s.substring(0,5)

//this part is autogenerated:
hJob2 = DSAttachJob("New2", DSJ.ERRFATAL)
If NOT(hJob2) Then
Call DSLogFatal("Job Attach Failed: New2", "JobControl")
Abort
End
ErrCode = DSSetParam(hJob2, "ChecklistParams", "(As pre-defined)")
ErrCode = DSSetParam(hJob2, "ChecklistParams.file", s)
ErrCode = DSSetParam(hJob2, "ChecklistParams.filter", fltr)
ErrCode = DSSetDisableProjectHandler(hJob2, @FALSE)
ErrCode = DSSetDisableJobHandler(hJob2, @FALSE)
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob2)
Status = DSGetJobInfo(hJob2, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then

Call DSLogFatal("Job Failed: New2", "JobControl")
End
//end of autogenerated code

}


Maybe it's just a syntax thing; here's the compile error:

Error compiling Job Control Subroutine Batch::New2
Compiling: Source = 'RT_BP906/JOB.889590643.DT.1534770407', Object = 'RT_BP906.O/JOB.889590643.DT.1534770407'
*******************************************************************************************************
0008 String[] files = {'S10001ECL.txt', 'S10005ECL.txt'};
^
']' unexpected, Was expecting: Assignment Operator
0009 String fltr = '';
^
Variable Name (UNDEFINED) unexpected, Was expecting: Assignment Operator
0011 for (String s : names) {
^
'(' unexpected, Was expecting: Array Name, Variable name, New variable name,
';', Statement label, "ABORT", "ABORTE", "ABORTM", "BEGIN", "BREAK",
"CALL", "CHAIN", "CLEAR", "CLEARCOM", "CLEARDATA", "CLEARFILE",
"CLEARPROMPTS", "CLEARSELECT", "CLOSE", "CLOSESEQ", "COM", "COMMON",
"CONVERT", "CREATE", "CRT", "DATA", "DEBUG", "DEL", "DELETE",
"DELETEU", "DIMENSION", "ECHO", "ERRMSG", "ENTER", "EQUATE",
"EXECUTE", "EXIT", "FILELOCK", "FILEUNLOCK", "FLUSH", "FOOTING",
"FOR", "GET", "GETX", "GOSUB", "GOTO", "GROUPSTORE", "HEADING",
"HEADINGE", "HEADINGN", "IF", "INPUT", "INPUTDP", "INPUTCLEAR",
"INPUTERR", "INPUTIF", "INPUTNULL", "INPUTTRAP", "INS", "KEYEDIT",
"KEYTRAP", "LET", "LOCATE", "LOCATEP", "LOCK", "LOOP", "MAT",
"MATBUILD", "MATPARSE", "MATREAD", "MATREADU", "MATWRITE",
"MATWRITEU", "NAP", "NOBUF", "NULL", "ON", "OPEN", "OPENDEV",
"OPENPATH", "OPENSEQ", "PAGE", "PERFORM", "PRECISION", "PRINT",
"PRINTER", "PRINTERIO", "PRINTERR", "PRINTERRX", "PROCREAD",
"PROCWRITE", "PROMPT", "RANDOMIZE", "READ", "READBLK", "READNEXT",
"READSEQ", "READT", "READU", "READV", "READVU", "RELEASE", "REMOVE",
"RETURN", "REWIND", "SEEK", "uSEEK", "SELECT", "SELECTN", "SELECTV",
"SELECTE", "SLEEP", "SSELECT", "SSELECTN", "SSELECTV", "STATUS",
"STOP", "STOPE", "STOPM", "STORAGE", "TABSTOP", "TTYCTL", "UNLOCK",
"WEOF", "WEOFSEQ", "WRITE", "WRITEBLK", "WRITESEQ", "WRITET",
"WRITEU", "WRITEV", "WRITEVU", "TPRINT", "INPUTDISP", "KEYEXIT",
"TIMEOUT", "FIND", "FINDSTR", "GETLIST", "DELETELIST", "READLIST",
"WRITELIST", "DECLARE", "TTYGET", "TTYSET", "HUSH", "ASSIGN",
"SELIND", "LOOPEOL", "uINPUT", "uINPUTDP", "CONTINUE", "DEFFUN",
"TRANSACTION", "OPENCHECK", "READL", "BSCAN", "REVREMOVE", "SETREM",
"AUTHORIZATION", "PCDRIVER", "READVL", "MATREADL", "RECORDLOCKL",
"RECORDLOCKU", "WRITESEQF", "WORDSIZE", "RECIO", "SETIT", "SEND",
"UPRINT", "AUXMAP"
0035 }
^
'}' unexpected, Was expecting: Array Name, Variable name, New variable name,
';', Statement label, "ABORT", "ABORTE", "ABORTM", "BEGIN", "BREAK",
"CALL", "CHAIN", "CLEAR", "CLEARCOM", "CLEARDATA", "CLEARFILE",
"CLEARPROMPTS", "CLEARSELECT", "CLOSE", "CLOSESEQ", "COM", "COMMON",
"CONVERT", "CREATE", "CRT", "DATA", "DEBUG", "DEL", "DELETE",
"DELETEU", "DIMENSION", "ECHO", "ERRMSG", "ENTER", "EQUATE",
"EXECUTE", "EXIT", "FILELOCK", "FILEUNLOCK", "FLUSH", "FOOTING",
"FOR", "GET", "GETX", "GOSUB", "GOTO", "GROUPSTORE", "HEADING",
"HEADINGE", "HEADINGN", "IF", "INPUT", "INPUTDP", "INPUTCLEAR",
"INPUTERR", "INPUTIF", "INPUTNULL", "INPUTTRAP", "INS", "KEYEDIT",
"KEYTRAP", "LET", "LOCATE", "LOCATEP", "LOCK", "LOOP", "MAT",
"MATBUILD", "MATPARSE", "MATREAD", "MATREADU", "MATWRITE",
"MATWRITEU", "NAP", "NOBUF", "NULL", "ON", "OPEN", "OPENDEV",
"OPENPATH", "OPENSEQ", "PAGE", "PERFORM", "PRECISION", "PRINT",
"PRINTER", "PRINTERIO", "PRINTERR", "PRINTERRX", "PROCREAD",
"PROCWRITE", "PROMPT", "RANDOMIZE", "READ", "READBLK", "READNEXT",
"READSEQ", "READT", "READU", "READV", "READVU", "RELEASE", "REMOVE",
"RETURN", "REWIND", "SEEK", "uSEEK", "SELECT", "SELECTN", "SELECTV",
"SELECTE", "SLEEP", "SSELECT", "SSELECTN", "SSELECTV", "STATUS",
"STOP", "STOPE", "STOPM", "STORAGE", "TABSTOP", "TTYCTL", "UNLOCK",
"WEOF", "WEOFSEQ", "WRITE", "WRITEBLK", "WRITESEQ", "WRITET",
"WRITEU", "WRITEV", "WRITEVU", "TPRINT", "INPUTDISP", "KEYEXIT",
"TIMEOUT", "FIND", "FINDSTR", "GETLIST", "DELETELIST", "READLIST",
"WRITELIST", "DECLARE", "TTYGET", "TTYSET", "HUSH", "ASSIGN",
"SELIND", "LOOPEOL", "uINPUT", "uINPUTDP", "CONTINUE", "DEFFUN",
"TRANSACTION", "OPENCHECK", "READL", "BSCAN", "REVREMOVE", "SETREM",
"AUTHORIZATION", "PCDRIVER", "READVL", "MATREADL", "RECORDLOCKL",
"RECORDLOCKU", "WRITESEQF", "WORDSIZE", "RECIO", "SETIT", "SEND",
"UPRINT", "AUXMAP"
Array 's.substring' never dimensioned.

5 Errors detected, No Object Code Produced.
(Batch::New2)

Post by chulett »

You need to check your BASIC documentation or some of the existing routines (all of which include source code) for the proper syntax. For starters, there's no type declaration for variables as everything is a string and your 'for loop' syntax is incorrect. Did you look at other examples before coding this up?
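
To illustrate the two syntax points above, here is a minimal sketch of how the posted Java fragment might look in DataStage BASIC. It assumes the same two file names and only logs the values rather than running the job. Note that no variable carries a type declaration and that array elements are numbered from 1.

```basic
* Sketch only: BASIC counterpart of the posted Java array and loop
      Dim Files(2)                      ;* two elements, numbered 1 and 2
      Files(1) = "S10001ECL.txt"
      Files(2) = "S10005ECL.txt"
      Fltr = ""                         ;* no type declaration needed

      For i = 1 To 2
         Fltr = Left(Files(i), 5)       ;* first five characters, like substring(0,5)
         Call DSLogInfo("File " : Files(i) : ", filter " : Fltr, "JobControl")
      Next i
```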

If all that changes in each iteration of the loop is the name of the file to be processed, a looping Sequence job is very easy to build and I would strongly encourage you to take that route. Only go 'under the covers' to leverage functionality that exists nowhere else. All you need is a way to generate a delimited list of your filenames, which the UserVariables stage can then store for you; the Start / End Loop stages will automate everything else.
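
As one possible way to produce that delimited list, the sketch below is a server routine that lists a directory and returns the names separated by commas. The routine name BuildFileList and its arguments DirPath and Pattern are hypothetical, and it assumes a UNIX engine host. A UserVariables stage could call it and hand the result to a Start Loop stage set up for a comma-delimited list.

```basic
* Hypothetical server routine BuildFileList(DirPath, Pattern)
* Returns a comma-delimited list of matching file names, or "" on failure
      Cmd = "cd " : DirPath : " && ls " : Pattern
      Call DSExecute("UNIX", Cmd, Output, SysRet)

      If SysRet <> 0 Then
         Call DSLogWarn("Command failed: " : Cmd, "BuildFileList")
         Ans = ""
      End Else
         * DSExecute separates output lines with field marks (@FM)
         Output = Trim(Output, @FM, "T")     ;* drop trailing field marks
         Ans = Convert(@FM, ",", Output)     ;* replace each field mark with a comma
      End
```

Inside the loop, the current list entry is normally exposed to the Job Activity stage as the Start Loop stage's $Counter value, for example StartLoop_Activity.$Counter, where the stage name is an assumption.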

Post by Nat_1 »

There ARE no examples. That was my point in the first post.

Thanks for pointing out that I need to use BASIC. I was using Java. Hopefully I can use arrays in BASIC (it's been a LONG time since I used it). Thanks for your help.

Post by chulett »

Actually, there are plenty of examples of syntax and function usage. While you won't find exact examples of what you're doing, every routine in the repository includes source code. Poke around and double-click on them to see it. That's the way most of us picked this up.

You'll also want to dig into the Server Job Developer's Guide pdf; there are chapters in there on the BASIC language and 'Programming in WebSphere DataStage'. There's also the BASIC Reference Guide pdf. Chapter 14 in the Designer Client Guide pdf is all about Building Job Sequences, an approach I still think you'd be much better off taking.
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There's also a training DVD at DSXchange Learning Center on the various uses of the DataStage BASIC programming language.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by Nat_1 »

Thanks for the info, guys. Though I'm not sure we have any examples in our repository. I wasn't here when the server was set up but I'm told there were many things that weren't installed, including sample code. I'll check the documentation for some sample code. In the meantime, I've made some changes to my code and I'm now only getting 2 errors when compiling:

dim files(2)
files(0) = "S10001ECL.txt"
files(1) = "S10005ECL.txt"

fltr = ""
i = 0
WHILE i <2


fltr = left(files(i),6)

hJob2 = DSAttachJob("New2", DSJ.ERRFATAL)
If NOT(hJob2) Then
Call DSLogFatal("Job Attach Failed: New2", "JobControl")
Abort
End
ErrCode = DSSetParam(hJob2, "ChecklistParams", "(As pre-defined)")
ErrCode = DSSetParam(hJob2, "ChecklistParams.file", files(i))
ErrCode = DSSetParam(hJob2, "ChecklistParams.filter", fltr)
ErrCode = DSSetDisableProjectHandler(hJob2, @FALSE)
ErrCode = DSSetDisableJobHandler(hJob2, @FALSE)
ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob2)
Status = DSGetJobInfo(hJob2, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then

Call DSLogFatal("Job Failed: New2", "JobControl")
End

i = i + 1


WEND

*******************************************************************************************************
Error compiling Job Control Subroutine Batch::New2
Compiling: Source = 'RT_BP906/JOB.889590643.DT.1534770407', Object = 'RT_BP906.O/JOB.889590643.DT.1534770407'
*******************************************************************************************************
0014 WHILE i <2
^
"WHILE" unexpected, Was expecting: Array Name, Variable name,
New variable name, ';', Statement label, "ABORT", "ABORTE", "ABORTM",
"BEGIN", "BREAK", "CALL", "CHAIN", "CLEAR", "CLEARCOM", "CLEARDATA",
"CLEARFILE", "CLEARPROMPTS", "CLEARSELECT", "CLOSE", "CLOSESEQ",
"COM", "COMMON", "CONVERT", "CREATE", "CRT", "DATA", "DEBUG",
"DEL", "DELETE", "DELETEU", "DIMENSION", "ECHO", "ERRMSG", "ENTER",
"EQUATE", "EXECUTE", "EXIT", "FILELOCK", "FILEUNLOCK", "FLUSH",
"FOOTING", "FOR", "GET", "GETX", "GOSUB", "GOTO", "GROUPSTORE",
"HEADING", "HEADINGE", "HEADINGN", "IF", "INPUT", "INPUTDP",
"INPUTCLEAR", "INPUTERR", "INPUTIF", "INPUTNULL", "INPUTTRAP", "INS",
"KEYEDIT", "KEYTRAP", "LET", "LOCATE", "LOCATEP", "LOCK", "LOOP",
"MAT", "MATBUILD", "MATPARSE", "MATREAD", "MATREADU", "MATWRITE",
"MATWRITEU", "NAP", "NOBUF", "NULL", "ON", "OPEN", "OPENDEV",
"OPENPATH", "OPENSEQ", "PAGE", "PERFORM", "PRECISION", "PRINT",
"PRINTER", "PRINTERIO", "PRINTERR", "PRINTERRX", "PROCREAD",
"PROCWRITE", "PROMPT", "RANDOMIZE", "READ", "READBLK", "READNEXT",
"READSEQ", "READT", "READU", "READV", "READVU", "RELEASE", "REMOVE",
"RETURN", "REWIND", "SEEK", "uSEEK", "SELECT", "SELECTN", "SELECTV",
"SELECTE", "SLEEP", "SSELECT", "SSELECTN", "SSELECTV", "STATUS",
"STOP", "STOPE", "STOPM", "STORAGE", "TABSTOP", "TTYCTL", "UNLOCK",
"WEOF", "WEOFSEQ", "WRITE", "WRITEBLK", "WRITESEQ", "WRITET",
"WRITEU", "WRITEV", "WRITEVU", "TPRINT", "INPUTDISP", "KEYEXIT",
"TIMEOUT", "FIND", "FINDSTR", "GETLIST", "DELETELIST", "READLIST",
"WRITELIST", "DECLARE", "TTYGET", "TTYSET", "HUSH", "ASSIGN",
"SELIND", "LOOPEOL", "uINPUT", "uINPUTDP", "CONTINUE", "DEFFUN",
"TRANSACTION", "OPENCHECK", "READL", "BSCAN", "REVREMOVE", "SETREM",
"AUTHORIZATION", "PCDRIVER", "READVL", "MATREADL", "RECORDLOCKL",
"RECORDLOCKU", "WRITESEQF", "WORDSIZE", "RECIO", "SETIT", "SEND",
"UPRINT", "AUXMAP"
0040 WEND
^
End of Line unexpected, Was expecting: Assignment Operator

2 Errors detected, No Object Code Produced.
(Batch::New2)

Post by ray.wurlod »

DataStage BASIC does not have WHILE..WEND.

The correct syntax for DataStage BASIC is given in the DataStage BASIC manual, and can be learned from the Programming with DataStage BASIC training DVD available from DSXchange Learning Center. While or Until tests can be used in uncounted or counted loops. In short:

Code:

Loop
   statements
{ While | Until } test_expression
   statements
Repeat
or

Code:

For variable = startvalue To finishvalue Step incrvalue
   statements
{ While | Until } test_expression
   statements
Next variable
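
Applied to the batch job posted earlier, a counted loop might look like the sketch below. It reuses the autogenerated calls from that post unchanged and assumes the file names are held in a comma-delimited string; the FileList, NumFiles, FileName and Fltr variables are introduced here for illustration. It also adds a DSDetachJob call so the handle is released before the next pass.

```basic
* Sketch only: run job "New2" once per file name in a delimited list
      FileList = "S10001ECL.txt,S10005ECL.txt"
      NumFiles = Dcount(FileList, ",")        ;* number of entries in the list

      For i = 1 To NumFiles
         FileName = Field(FileList, ",", i)   ;* i-th entry, counting from 1
         Fltr = Left(FileName, 5)             ;* first five characters; adjust to suit

         hJob2 = DSAttachJob("New2", DSJ.ERRFATAL)
         If NOT(hJob2) Then
            Call DSLogFatal("Job Attach Failed: New2", "JobControl")
            Abort
         End
         ErrCode = DSSetParam(hJob2, "ChecklistParams", "(As pre-defined)")
         ErrCode = DSSetParam(hJob2, "ChecklistParams.file", FileName)
         ErrCode = DSSetParam(hJob2, "ChecklistParams.filter", Fltr)
         ErrCode = DSSetDisableProjectHandler(hJob2, @FALSE)
         ErrCode = DSSetDisableJobHandler(hJob2, @FALSE)
         ErrCode = DSRunJob(hJob2, DSJ.RUNNORMAL)
         ErrCode = DSWaitForJob(hJob2)
         Status = DSGetJobInfo(hJob2, DSJ.JOBSTATUS)
         If Status = DSJS.RUNFAILED Or Status = DSJS.CRASHED Then
            Call DSLogFatal("Job Failed: New2 for file " : FileName, "JobControl")
         End
         ErrCode = DSDetachJob(hJob2)         ;* release the handle before the next pass
      Next i
```

With 1000 files, the list would more realistically be read from a file or built by a command than typed into the code; the literal string here only keeps the sketch short.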

Post by chulett »

Nat_1 wrote: Though I'm not sure we have any examples in our repository. I wasn't here when the server was set up but I'm told there were many things that weren't installed, including sample code.
Understand that I wasn't referring to specific "sample code" being provided, but rather to the fact that the core set of Server routines that install with the product all include their source code.