Page 1 of 1

Generate sequence number using parallel job

Posted: Sun Jun 18, 2017 11:19 am
by rish0201
I want to generate sequence numbers starting from 1 every time the job run.
Number should be generated in sequence without gaps in 4 node job.

Thing i have already tried
1) (@inrownum - 1) * @numpartitions + @partitionnum + 1
The above formula only works in 2 node.

2) Column generator - It creates gaps between numbers in 4 nodes and the numbers are not in sequence.

3) Surrogate Key - It takes the last highest value and starts from there. I want it to start from 1 for each job run. Is there any way to reset the key ?

Any suggestions is welcome.

Thanks in Advance.

Posted: Sun Jun 18, 2017 6:23 pm
by ray.wurlod
In what way does the given expression "not work" with four nodes?

Posted: Wed Jun 21, 2017 3:02 am
by rish0201
In 4 nodes the numbers are not generated in sequence. It leaves gaps in between.

Posted: Wed Jun 21, 2017 7:08 am
by Mike
Did you use round robin partitioning method?

Mike

Re: Generate sequence number using parallel job

Posted: Wed Jun 21, 2017 7:51 am
by chulett
rish0201 wrote:I want it to start from 1 for each job run.
Then it would seem you do not have a need for any kind of surrogate key 'generator' but simply need a counter. At its most simple, you could use a stage variable for that, initial value of zero, increment for each record... as long as you handled the partitioning properly.

Re: Generate sequence number using parallel job

Posted: Thu Jun 22, 2017 4:59 am
by Neelabh
As inferred by @Mike as well this method requires data to be evenly distributed across all partition hence it works well with Round Robin partitioning. If not it will generate sequence with "gaps" (i.e. what you are observing).

Use other methods as suggested by craig.

1) Use Stage variable.

2) If source is seq. file stage generate row number.

3) If database use row_number function.

4) Enable transformer stage to run on a single node and use @INROWNUM

Posted: Thu Jun 22, 2017 5:45 am
by boxtoby
Hi,

You weren't far out with your original idea.

Here's what I use: (@PARTITIONNUM+1)*10 : @OUTROWNUM

Use @OUTROWNUM rather than @INROWNUM to aviod problems with funnel stages etc.

Hope that helps!

Cheers,
Bob.