Datastage Audit Job

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Sumith
Participant
Posts: 5
Joined: Tue Sep 04, 2018 8:37 am

Datastage Audit Job

Post by Sumith »

Hi All,

I have a requirement like this: we extract data from DB2/ODBC and load the data into a table or file. We have to add audit details such as the source record count, target record count, and the difference to the audit tables on a daily basis. Can you please advise?
Thanks
Sumith
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Hmmm... this is not a topic for a short conversation. But let's see if we can lay some groundwork here at least.

About the only thing right now that I'd ask you to clarify is your "on a daily basis" comment. Does this mean you need to target something that runs at the end of your work day to gather all of this information up (from somewhere as yet to be determined) and insert it into your audit table all at once - or are you looking for a mechanism where your audit table is populated during each job run, or possibly via a tightly-coupled after-job type process? Meaning over and over again over the course of the day.

Or are we too early in the design stage to even know yet? :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sumith
Participant
Posts: 5
Joined: Tue Sep 04, 2018 8:37 am

Datastage Audit Job

Post by Sumith »

Thanks for your response.

In our project, we extract data from 3 different systems, the sources being DB2 and flat files. After loading the data, we have to write the source and target counts into a file in the layout below.

SYS_NAME, Current_Date, Source Count, Target Count, Difference
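A record in that layout could be assembled with a small shell function along these lines (the field order comes from the post above; the file name, date format, and function name are hypothetical):

```shell
#!/bin/sh
# Append one audit record: SYS_NAME, Current_Date, Source Count,
# Target Count, Difference. AUDIT_FILE path is an assumption.
AUDIT_FILE="${AUDIT_FILE:-audit_counts.csv}"

write_audit_record() {
    sys_name="$1"
    src_count="$2"
    tgt_count="$3"
    diff=$((src_count - tgt_count))
    printf '%s,%s,%s,%s,%s\n' \
        "$sys_name" "$(date +%Y-%m-%d)" "$src_count" "$tgt_count" "$diff" \
        >> "$AUDIT_FILE"
}

# Example: one record per source system after its load completes.
write_audit_record SYS_A 1000 998
```

Each of the three source systems would call this once after its load, so the file accumulates one row per system per day.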
Thanks
Sumith
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

For a do-it-yourself approach you could create something using the DataStage API, either as a DataStage BASIC routine or as an external command. The API has functions for obtaining link row counts - you should investigate these. A systematic approach to naming links in the job will help markedly.
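For the external-command route, the `dsjob` client can report link information after a run. A rough sketch of an after-job audit step follows - the project, job, stage, and link names are placeholders, and the exact layout of the `-linkinfo` report varies by version, so the parsing (grabbing the last number) is an assumption to verify locally:

```shell
#!/bin/sh
# After-job audit sketch using the dsjob command-line client.
# Assumes dsjob is on PATH; PROJECT/JOB and the lnk_* link names
# below are placeholders illustrating a systematic naming scheme.
PROJECT="${PROJECT:-MyProject}"
JOB="${JOB:-DailyLoadJob}"

# Extract the last number from the link report as its row count
# (check this against your version's actual -linkinfo output).
get_link_rows() {
    dsjob -linkinfo "$PROJECT" "$JOB" "$1" "$2" | grep -o '[0-9][0-9]*' | tail -1
}

# With consistent link names (e.g. lnk_src_in / lnk_tgt_out on the
# source and target stages), one function can audit any job.
audit_job() {
    src=$(get_link_rows "$1" lnk_src_in)
    tgt=$(get_link_rows "$2" lnk_tgt_out)
    echo "$JOB src=$src tgt=$tgt diff=$((src - tgt))"
}
```

This is where the systematic link naming pays off: the same script works for every job without per-job edits.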

Another approach would be to implement Kim Duke's ETLStats package, which you can download from here. Follow the ETL STATS link. Let me know if you need access granted. This package can capture row counts for every job in your project.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Does not the DSODB / Administration Console provide the majority of that functionality nowadays?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Only if you run jobs in a fashion that generates operational metadata. And that's not appropriate for all jobs.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rumu
Participant
Posts: 286
Joined: Mon Jun 06, 2005 4:07 am

Post by rumu »

Hi All,

I have done something similar for our daily batch jobs. We call it the Load_Information table.
A static table is maintained with the job name, source table name, and target table name.
A generic Unix script captures the audit information and is invoked as a before/after-job subroutine. The script reads the static table to look up the source and target table names, connects to the database to get the counts, and updates the Load_Information table.
A new approach being considered lately is to run this script only once, at the end of the batch, to capture the information. In that case, the script loops through the static table. We are not capturing the differences this way, as that is only possible when source to target is 1:1.
To check source minus target, we use a separate script that applies minus logic between the source query and the target query.
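The batch-end loop described above could be sketched roughly as follows - a sketch only, with hypothetical file names and a placeholder for the count query (in practice something like `db2 -x "SELECT COUNT(*) FROM ..."`):

```shell
#!/bin/sh
# End-of-batch audit sketch: loop through a static control file of
# JOB_NAME|SOURCE_TABLE|TARGET_TABLE rows, fetch counts, and append
# to a load-information file. File names are illustrative.
STATIC_TABLE="${STATIC_TABLE:-load_control.txt}"
LOAD_INFO="${LOAD_INFO:-load_information.txt}"

count_rows() {
    # Placeholder: replace with the real DB client call, e.g.
    #   db2 -x "SELECT COUNT(*) FROM $1"
    echo 0
}

run_audit() {
    while IFS='|' read -r job src_tbl tgt_tbl; do
        [ -z "$job" ] && continue
        s=$(count_rows "$src_tbl")
        t=$(count_rows "$tgt_tbl")
        printf '%s|%s|%s|%s|%s|%s\n' \
            "$job" "$src_tbl" "$tgt_tbl" "$s" "$t" "$((s - t))" \
            >> "$LOAD_INFO"
    done < "$STATIC_TABLE"
}
```

Note the count difference only tells you the totals match; as mentioned above, a separate minus-logic query (source query MINUS target query) is still needed to catch rows that differ in content.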
Rumu
IT Consultant