Archiving a sequential file while the job is running

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.


bman
Participant
Posts: 33
Joined: Wed Oct 10, 2007 5:42 pm


Post by bman »

Hi

I have a parallel job that writes to a sequential file. My requirement is to archive the file whenever its size reaches a limit, so that the file does not grow to an unmanageable size.
But I am finding it difficult to back up the file while the job is running.

I tried the following method:

The Sequential File stage writes to a link file that is a symbolic link to the original file. Whenever the original file reaches the size limit, I change the link to point to a new file:
Link File ---> Original File 1
and once Original File 1 is big enough, change the link to
Link File ---> Original File 2
But this method fails: the Sequential File stage keeps the file open, so even after I change the link, data is still written to Original File 1.
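Here is a small standalone sketch that reproduces what I am seeing (plain Python on a POSIX filesystem, just to illustrate the OS behaviour, not anything DataStage-specific; the file names are made up):

```python
import os

# Create the original file and a symlink pointing at it.
open("original_file_1.txt", "w").close()
os.symlink("original_file_1.txt", "link_file.txt")

# The writer opens the link once and holds the handle open for the
# whole run, just like the Sequential File stage does.
writer = open("link_file.txt", "a")
writer.write("row 1\n")
writer.flush()

# Re-point the symlink at a new file while the handle is still open.
os.remove("link_file.txt")
open("original_file_2.txt", "w").close()
os.symlink("original_file_2.txt", "link_file.txt")

# Subsequent writes still land in original_file_1.txt, because the open
# descriptor refers to the file resolved at open() time, not to the
# symlink name.
writer.write("row 2\n")
writer.close()

print(open("original_file_1.txt").read())  # contains row 1 and row 2
print(open("original_file_2.txt").read())  # empty
```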

Is there any other way to achieve this? Is there a setting in DataStage to reopen the file pointer, or something similar?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You can't "archive" or otherwise futz with a file that is being written to. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Can you explain more about your parallel job? Is it a batch job that runs one or more times during the day, or is it a trickle feed, like reading a queue all day long? Have you considered writing to a compressed file? The Sequential File stage's filter option can be used to force compression on output: set the filter to 'compress -c' or 'gzip -c'.
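Outside of DataStage, the filter option amounts to piping the rows through the command before they hit disk. A rough sketch of the same idea in plain Python (the target file name and row contents are made up, and it assumes gzip is on the path):

```python
import subprocess

# Pipe the rows through 'gzip -c' so the stream that lands on disk is
# already compressed - roughly what the stage's filter option does with
# the command you give it.
with open("target.txt.gz", "wb") as out:
    gz = subprocess.Popen(["gzip", "-c"], stdin=subprocess.PIPE, stdout=out)
    for i in range(1000):
        gz.stdin.write(("row %d\n" % i).encode())
    gz.stdin.close()
    gz.wait()
```

The archive can then be read back with 'gunzip -c target.txt.gz'.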

To the general public: is there a way to make a DataStage job shut itself down? In this case, the job would count how many records have been processed and, when the limit is reached, stop itself. Once stopped, archive the target file and then restart with a new file.

Brad.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Take a look in the Transformer stage constraints dialog - there should be a row count limiter there.

But the job status would then be Stopped, so the job would need to be reset. Before that, you would need to establish how far it got, so that the extraction phase (or maybe just the load phase) can begin again from that point (+ 1).
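Sketched outside DataStage, that "stop at a limit, then resume from where you got up to" pattern looks roughly like this (plain Python; the limit, the checkpoint file name and the read_source_rows() helper are all made up for illustration):

```python
import os

LIMIT = 100_000              # rows to write before stopping this run
CHECKPOINT = "extract.ckpt"  # remembers how far the previous run got

def read_source_rows(start):
    # Stand-in for the real extraction: yields (row_number, row_text)
    # starting at the given row number.
    n = start
    while True:
        yield n, "row %d\n" % n
        n += 1

last = 0
if os.path.exists(CHECKPOINT):
    with open(CHECKPOINT) as ckpt:
        last = int(ckpt.read().strip())

written = 0
with open("target_%d.txt" % (last + 1), "w") as out:
    for n, row in read_source_rows(last + 1):
        out.write(row)
        written += 1
        last = n
        if written >= LIMIT:   # the "row count limiter"
            break

# Record where this run got up to; the next run resumes from last + 1,
# and the file just closed can now be archived safely.
with open(CHECKPOINT, "w") as ckpt:
    ckpt.write(str(last))
```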
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.