Sequential file outputs twice the number of rows it has!

ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Sequential file outputs twice the number of rows it has!

Post by ady »

Hi,

I have a server job which writes 1249149 rows to a sequential file. When the same file is used as input in another job, that job reads 2410220 rows.

I don't understand why this is happening. I tried the file in a parallel job as well, and it also gives out 2410220 rows.


Please help
meena
Participant
Posts: 430
Joined: Tue Sep 13, 2005 12:17 pm

Post by meena »

Hi,
What exactly are you doing in the job? After loading the data into the sequential file, are you able to view the data? Also check the "Update action" setting.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Are you appending to the file?
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

I am overwriting the file.

I am able to view the data properly. My job design is:

Seq file > Transformer > Seq file

The transformer has a constraint which does not allow blank rows to pass. That's it!
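For anyone reading along, a constraint like that is equivalent to dropping blank lines from a flat file. A rough shell analogue, where input.txt and output.txt are made-up names for this sketch:

    # Drop completely blank (empty or all-whitespace) rows,
    # as the Transformer constraint is described as doing.
    grep -v '^[[:space:]]*$' input.txt > output.txt
    wc -l input.txt output.txt   # compare source and target row counts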
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Something is off. What is your source? Regardless, check the row count of your source and compare it with the target. Something fishy is going on with the row counts you see in the Designer.
Are you sure no other process is going on, like an after-job subroutine that doubles the file or something? Is anything happening inside your job, in stage variables or in a routine, that could be duplicating records?
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

The data comes from a script. The script gives out the record count in the last row of the data, which is rejected; I extract that record count and write it to a different sequential file.

The job does have an after-job subroutine!

The process compares the "count from the script" with the "count of rows processed on the output link" and, if they are the same, moves the output file to a different location; otherwise it deletes the file.
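In outline, that check behaves like the following shell sketch (a sketch of the logic described above, not the actual routine; SCRIPT_COUNT, LINK_COUNT, and all paths are made-up placeholders):

    # Compare the script's trailer count with the output link count.
    if [ "$SCRIPT_COUNT" -eq "$LINK_COUNT" ]; then
        mv /data/out/result.txt /data/archive/result.txt   # counts match: keep the file
    else
        rm -f /data/out/result.txt                         # counts differ: discard it
    fi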
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Is there a before/after stage/job subroutine that copies the file into place before the job moves a second set of records to it?

What are the link row counts reported? You chose not to reveal this vital piece of information.

In the parallel job it might occur, for example, if you used Entire partitioning on two nodes.

In a server job, something external is happening. Are you overwriting or appending?

Have you checked that your script is returning the correct row count? What does wc -l filename report?

Are there any newline characters in your data? If so, have you handled that situation in the column metadata?
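To make those checks concrete, here are a few commands one could run against the landed file (result.txt is a placeholder name):

    wc -l result.txt           # physical line count vs. the 1249149 the job reports
    tail -1 result.txt         # inspect the trailer row carrying the script's count
    od -c result.txt | head    # look for stray \r or embedded newlines in the data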
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

The routine moves the output file to a different location when the job succeeds. I have checked whether the script is appending to the previous version of the file, but it's not; it's moving the file and replacing the old one, as it's supposed to do.

There are no newline characters in the file. I have checked, and the script gets the correct row count.

As you said, something external is happening! ... Hmm.
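One last check worth running here: if records really are being duplicated somewhere, comparing the total line count with the unique line count will show it (result.txt is again a placeholder name):

    total=$(wc -l < result.txt)
    unique=$(sort result.txt | uniq | wc -l)
    echo "total=$total unique=$unique"   # unique well below total means duplicated rows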