Can we change the parameter value at run time?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Md Dawar Mughni
Participant
Posts: 10
Joined: Wed Sep 30, 2009 11:55 pm
Location: Pune,India
Contact:

Can we change the parameter value at run time?

Post by Md Dawar Mughni »

Hi All,

The requirement is to split a file into multiple files such that:
1. Each file contains at most a specified number of records, and
2. All records with the same key value end up in the same file.

Eg:
source file :

col1 col2
1 a
1 b
1 c
2 d
3 k
3 g
4 b
5 x
5 b


If the number of records in each output file must be <= 4,
then the output files would be:

file1:

col1 col2
1 a
1 b
1 c
2 d

file2:

col1 col2
3 k
3 g
4 b

file3:
col1 col2
5 x
5 b

How can this be implemented in DataStage?

Appreciate the help in advance.

Warm Regards,
-Dawar
"Just do it"
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Can you explain, please, how your choice of subject is related to the question that you asked? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
dsx999
Participant
Posts: 29
Joined: Mon Aug 11, 2008 3:40 am

Post by dsx999 »

What should happen if you have more than 4 same key values (as per your example)?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I only asked to make sure we're answering the right question, that we know everything that was behind bringing you here. Off the bat, I don't see the connection between them... but maybe that's just me.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

Can you please explain what should happen if the limit is <= 4 records per file and a particular key has more than 4 records? E.g. instead of 3 records for key 1, you have 9 records.
Regards,
S. Kirtikumar.
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

Try the approach below. I think it would work.

Aggregate the data on the key column and get the record count for each key. In your case that would be:
1 - 3
2 - 1
3 - 2
4 - 1
5 - 2

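Sort this data on the key and feed it to the Transformer sequentially (without partitioning). The stage variables, evaluated in this order, should be:
RecordCount = If CurrFile = PrevFile Or PrevFile = file0 Then RecordCount + CurrRecCount Else CurrRecCount
CurrRecCount = incoming record count for the key
PrevFile = CurrFile
CurrFile = If (RecordCount + CurrRecCount) <= 4 Then CurrFile Else CurrFile + 1

Note that when RecordCount is evaluated, CurrRecCount still holds the value from the previous row, because it is reassigned only on the next line.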

It would work as shown below. The first line, with RowNum 0, shows the initial values of the stage variables:

RowNum   RecordCount       CurrRecCount     PrevFile             CurrFile
0        0                0                 file0                file1
1        0                3                 file1                file1
2        3                1                 file1                file1
3        4                2                 file1                file2
4        2                1                 file2                file2
5        3                2                 file2                file3
Now join this with your original data on the key column, and each record will carry the file it should go to.
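To make the counting-and-assigning idea concrete outside DataStage, here is a minimal Python sketch. It is not Transformer code; the function name `assign_files` and the `max_rows` argument are made up for illustration, and it assumes (as stated later in the thread) that no key has more records than the limit.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def assign_files(keys: Iterable[Hashable], max_rows: int = 4) -> Dict[Hashable, int]:
    """Map each key to a 1-based file number, packing sorted keys greedily."""
    counts = Counter(keys)  # plays the role of the aggregation step: rows per key
    assignment: Dict[Hashable, int] = {}
    file_no, rows_in_file = 1, 0
    for key in sorted(counts):  # plays the role of the sequential, sorted input
        n = counts[key]
        if n > max_rows:
            # Assumption from the thread: this case is not expected to occur.
            raise ValueError(f"key {key!r} has {n} rows, more than {max_rows}")
        if rows_in_file + n > max_rows:
            # This key would overflow the current file, so start a new one.
            file_no += 1
            rows_in_file = 0
        assignment[key] = file_no
        rows_in_file += n
    return assignment


source_keys = [1, 1, 1, 2, 3, 3, 4, 5, 5]
print(assign_files(source_keys))  # {1: 1, 2: 1, 3: 2, 4: 2, 5: 3}
```

With the sample data, keys 1 and 2 land in file 1, keys 3 and 4 in file 2, and key 5 in file 3, which matches the CurrFile column in the table.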
Regards,
S. Kirtikumar.
Md Dawar Mughni
Participant
Posts: 10
Joined: Wed Sep 30, 2009 11:55 pm
Location: Pune,India
Contact:

Post by Md Dawar Mughni »

dsx999 wrote: What should happen if you have more than 4 same key values (as per your example)?
The assumption is that no key will have more than 4 records.
Md Dawar Mughni
Participant
Posts: 10
Joined: Wed Sep 30, 2009 11:55 pm
Location: Pune,India
Contact:

Post by Md Dawar Mughni »

chulett wrote: I only asked to make sure we're answering the right question, that we know everything that was behind bringing you here. Off the bat, I don't see the connection between them... but maybe that's just me.
The requirement is to split a file into multiple files based on a fixed number of records (4 in my example), provided that all records with the same key go into the same file.

If there is any doubt, please let me know and I will clarify further.
"Just do it"
Kirtikumar
Participant
Posts: 437
Joined: Fri Oct 15, 2004 6:13 am
Location: Pune, India

Post by Kirtikumar »

What I described was to be done in the first job. Then call the second job multiple times, as many times as there are rows created by the first job.

During each call, pass in the filename taken from the file created by the first job. In the second job, joining the original file with the file created by the first job gives you the desired result.
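As a rough illustration of that join step (in Python rather than a DataStage job), the sketch below attaches a file number to every original record and groups the records per file. The `key_to_file` mapping is written out by hand here and stands in for the first job's output; `split_rows` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[int, str]

# Original data from the question (col1, col2).
rows: List[Row] = [
    (1, "a"), (1, "b"), (1, "c"), (2, "d"),
    (3, "k"), (3, "g"), (4, "b"), (5, "x"), (5, "b"),
]

# Assumed output of the first job: key -> file number.
key_to_file: Dict[int, int] = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3}


def split_rows(rows: List[Row], key_to_file: Dict[int, int]) -> Dict[int, List[Row]]:
    """Join each row to its file number on the key and group rows per file."""
    files: Dict[int, List[Row]] = defaultdict(list)
    for key, value in rows:
        files[key_to_file[key]].append((key, value))
    return dict(files)


grouped = split_rows(rows, key_to_file)
for file_no in sorted(grouped):
    print(f"file{file_no}:", grouped[file_no])
```

Each entry of `grouped` corresponds to one run of the second job, which would receive only the records for its file.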
Regards,
S. Kirtikumar.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Md Dawar Mughni wrote:
chulett wrote: I only asked to make sure we're answering the right question, that we know everything that was behind bringing you here. Off the bat, I don't see the connection between them... but maybe that's just me.
The requirement is to split a file into multiple files based on a fixed number of records (4 in my example), provided that all records with the same key go into the same file.

If there is any doubt, please let me know and I will clarify further.
Sorry, but this still doesn't do anything to answer my question - what does this requirement (which you are clarifying here) have to do with your subject of "Can we change the parameter value at run time?". However, I'm just going to let that go and stop worrying about it now.

Carry on.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Md Dawar Mughni
Participant
Posts: 10
Joined: Wed Sep 30, 2009 11:55 pm
Location: Pune,India
Contact:

Post by Md Dawar Mughni »

Well, let me clarify that.
I was thinking of changing the parameter value at run time:
1. Have two parameters:

a. FileName (value "SplitedFile")
b. Suffix (value 1)

and use them in the File stage where all the files should get created:
#File_Path#/#FileName##Suffix#.txt

2. The value of the Suffix parameter should be changed inside the Transformer according to the number of records, e.g. for the first 4 records it will be 1, for the next 5 it will be 2, and so on.

(But I don't know whether we can do this or not, because I didn't find anything related in the Transformer.)

Input File Stage ----> Transformer ----> Output File Stage

But I am not very sure how we can do this.

Please assist.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

You can't do this the way you are currently envisioning for two main reasons:
1) You can't change job parameters on the fly while the job is running (they are resolved at job submission time only)
2) SeqFile doesn't support closing and opening multiple files during a job run.

As the number of files/number of records per file may change from run to run, one potential option is to use a BuildOp, custom operator or external target to handle the file writes. Your transformer could pass the filename as a column. I would envision something like this:

Input File->Transformer->Column Export->[BuildOp or CustomOp or ExtTarget]

The purpose of the Column Export would be to create your final output record and place it in a single column to the file-handling stage.
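To show what such a file-handling stage would have to do, here is a minimal Python sketch, not BuildOp or custom operator code. It assumes each incoming record is a (filename, exported line) pair, with the filename column set upstream; `write_by_filename` and the sample filenames (built from the "SplitedFile" prefix mentioned earlier) are illustrative only.

```python
import os
import tempfile
from typing import Iterable, List, Tuple


def write_by_filename(records: Iterable[Tuple[str, str]], out_dir: str) -> List[str]:
    """Write each (filename, line) record to out_dir/filename.

    Returns the distinct filenames in the order they were first seen.
    """
    written: List[str] = []
    current_name, handle = None, None
    try:
        for name, line in records:
            if name != current_name:
                # The filename column changed: close the old file, open the next.
                if handle is not None:
                    handle.close()
                # Append mode, so a filename that reappears is not truncated.
                handle = open(os.path.join(out_dir, name), "a", encoding="utf-8")
                current_name = name
                if name not in written:
                    written.append(name)
            handle.write(line + "\n")
    finally:
        if handle is not None:
            handle.close()
    return written


records = [
    ("SplitedFile1.txt", "1 a"), ("SplitedFile1.txt", "1 b"),
    ("SplitedFile1.txt", "1 c"), ("SplitedFile1.txt", "2 d"),
    ("SplitedFile2.txt", "3 k"), ("SplitedFile2.txt", "3 g"),
    ("SplitedFile2.txt", "4 b"),
    ("SplitedFile3.txt", "5 x"), ("SplitedFile3.txt", "5 b"),
]

with tempfile.TemporaryDirectory() as out_dir:
    print(write_by_filename(records, out_dir))
```

The point of the sketch is that the switch between files is driven by the data itself, so no job parameter needs to change while the job runs.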

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.