Page 1 of 1

To check flat file data with source data

Posted: Wed Feb 16, 2005 3:35 pm
by sumitchandra
Hi,

We are using DataStage to extract data from source (SQL 2000) and dump the data in flat file.

We would like to check if the data was extracted appropriately that is compare the row count of data in flat file with that od source to ensure that we have all the intended data in flat file.

How can this be checked in DataStage ?

Thanks
sc

Posted: Wed Feb 16, 2005 3:53 pm
by kcbland
There's a bunch of ways, depending on you. Here's two:

1. In your job, add an extra output link to an aggregator and pass a column containing @OUTROWNUM variable. Do a MAX(in.column) type derivation, and have an output link from the aggregator to a text file. When your job is finished, you have an output file plus a rowcount file. Now write a shell script to do a "wc -l your_output_file" and compare the result to that in the rowcount file.

2. Write a shell script to get the link rowcount from the job in question, and verify that number against the "wc -l your_output_file" result.

Posted: Wed Feb 16, 2005 8:05 pm
by ray.wurlod
On Windows you will need to get MKS toolkit or something similar to use the wc command.

DataStage counts the rows that it processes. That's one source. For the other, create a small DataStage job that executes SELECT COUNT(*) FROM whatever and returns just that number into a file. You can then easily compare the two.