Hi,
We are using DataStage to extract data from source (SQL 2000) and dump the data in flat file.
We would like to check if the data was extracted appropriately that is compare the row count of data in flat file with that od source to ensure that we have all the intended data in flat file.
How can this be checked in DataStage ?
Thanks
sc
To check flat file data with source data
Moderators: chulett, rschirm, roy
-
sumitchandra
- Participant
- Posts: 1
- Joined: Sun Jul 04, 2004 10:11 am
There's a bunch of ways, depending on you. Here's two:
1. In your job, add an extra output link to an aggregator and pass a column containing @OUTROWNUM variable. Do a MAX(in.column) type derivation, and have an output link from the aggregator to a text file. When your job is finished, you have an output file plus a rowcount file. Now write a shell script to do a "wc -l your_output_file" and compare the result to that in the rowcount file.
2. Write a shell script to get the link rowcount from the job in question, and verify that number against the "wc -l your_output_file" result.
1. In your job, add an extra output link to an aggregator and pass a column containing @OUTROWNUM variable. Do a MAX(in.column) type derivation, and have an output link from the aggregator to a text file. When your job is finished, you have an output file plus a rowcount file. Now write a shell script to do a "wc -l your_output_file" and compare the result to that in the rowcount file.
2. Write a shell script to get the link rowcount from the job in question, and verify that number against the "wc -l your_output_file" result.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
On Windows you will need to get MKS toolkit or something similar to use the wc command.
DataStage counts the rows that it processes. That's one source. For the other, create a small DataStage job that executes SELECT COUNT(*) FROM whatever and returns just that number into a file. You can then easily compare the two.
DataStage counts the rows that it processes. That's one source. For the other, create a small DataStage job that executes SELECT COUNT(*) FROM whatever and returns just that number into a file. You can then easily compare the two.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.