Do data sets for Link Collector links need to be the same size?

Post questions here relating to DataStage Server Edition, covering areas such as Server job design, DS BASIC, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

jpr196
Participant
Posts: 65
Joined: Tue Sep 26, 2006 1:49 pm
Location: Virginia

Do data sets for Link Collector links need to be the same size?

Post by jpr196 »

Hey All,

Hopefully an easy question. We have 2 sequential files (same structure) loading to one table. We were using a Link Collector stage, but our job has failed a few times after running for a period of time. The files have different amounts of data (the 1st has 50 million rows and the 2nd has 70 million). Would this cause a timeout error when using the Link Collector stage with the round-robin algorithm? Is the simplest solution to break it up into 2 jobs and process each file separately? Thanks in advance!
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What kind of failure, a timeout? Basically yes, they should be of similar volume, but I don't think that's a hard and fast requirement. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The Link Collector stage is notorious for this behaviour. It takes Round Robin to mean "wait", rather than "skip if not ready". It does not process the "end of data" token gracefully.
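
For illustration only, here is a toy sketch in Python (not DataStage code, and not how the stage is actually implemented) of the difference between a round robin that waits on the next link in turn and one that skips links that have run dry. The link names and row counts are invented; the stall behaviour is the point.

Code:

    # Toy illustration (not DataStage code) of why a strict round-robin
    # collector stalls when its input links hold different row counts.

    def round_robin_wait(links):
        """Strict round robin: always demand a row from the next link in turn.
        Once the shorter link runs dry, the collector cannot continue --
        analogous to the job hanging or aborting."""
        iters = [iter(link) for link in links]
        while True:
            for it in iters:
                yield next(it)  # errors out once any link is exhausted

    def round_robin_skip(links):
        """Round robin that drops exhausted links and keeps collecting."""
        iters = [iter(link) for link in links]
        while iters:
            for it in list(iters):
                try:
                    yield next(it)
                except StopIteration:
                    iters.remove(it)

    short_link = range(3)   # stands in for the 50-million-row file
    long_link = range(5)    # stands in for the 70-million-row file

    print(list(round_robin_skip([short_link, long_link])))  # [0, 0, 1, 1, 2, 2, 3, 4]
    # list(round_robin_wait([short_link, long_link]))       # raises once short_link is exhausted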

You could cat the files together in a Filter command and then, within the job if you want to, use Link Partitioner and Link Collector stages in concert to get parallel processing. But try it without these stages first - I think the speed of the Sequential File stage will be more than adequate.
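
For illustration, a minimal sketch of what "cat the files together" buys you, written in Python with hypothetical paths; in practice the same effect comes from the OS cat command or the Sequential File stage's Filter option as described above.

Code:

    # Minimal sketch of the "concatenate first" approach. The paths are
    # hypothetical. The point is a single input stream: all of file 1's
    # rows, then all of file 2's, with no interleaving for a collector
    # to manage.

    import shutil

    sources = ["/data/extract_1.txt", "/data/extract_2.txt"]  # hypothetical paths

    with open("/data/combined.txt", "wb") as out:
        for path in sources:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)  # stream each file in turn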
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That probably explains why I don't use it. As noted, I prefer concatenation, either before the job or dynamically via the Filter command.
-craig

"You can never have too many knives" -- Logan Nine Fingers
jpr196
Participant
Posts: 65
Joined: Tue Sep 26, 2006 1:49 pm
Location: Virginia

Post by jpr196 »

Thanks for the responses and suggestions. I need to filter data from each file before loading (and this is a one-time load), so I think I'll process each file separately to work around the Link Collector.