Capturing record count for scd2 Insert and Update

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
hi_manoj
Participant
Posts: 56
Joined: Sat Aug 13, 2011 2:00 pm
Location: BLR

Capturing record count for scd2 Insert and Update

Post by hi_manoj »

Hi All,
This is a scd2 implementation job and I want capturing number of records inserted and updated to my target. We have two separate links to Oracle connecter stage (one for insert and one for update).
In the current audit capturing process we are reading the count from stage_name.link_name from the output link of CDC stage (this is our source count) and same process from insert and update link to the target connector stage (addition of this two will give us the target count)

This process will work fine if all my records from source are new. So all record will go in the insert link and my source count and target count will match.

If there is any updated records then for that given updated record we will have a 1 record in insert link (new record) and one in the update link (to expire the old record). SO there will be 2 counts for 1 updated record, which will create a difference in count in the audit table.

The audit job is a separate job.
Our after job subroutine (in SCD2 load job) is creating a XML report of the job log and our audit job processes the xml file to extract the information using stage_name.link_name.

Is there any changes I need to do in my job so that I can capture the correct count for a updated record.

Regards
~Manoj
Manoj
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Those are the 'correct' counts, technically. However, perhaps you could subtract the updates from the inserts to get the insert count for new records only in this case.
-craig

"You can never have too many knives" -- Logan Nine Fingers
hi_manoj
Participant
Posts: 56
Joined: Sat Aug 13, 2011 2:00 pm
Location: BLR

Post by hi_manoj »

Thanks Craig
I understand..you have suggested a mathematical formula to match the count between record from source to record to target.

I my case, as I also have a delete link along with the update and insert, so for me the formula is
Insert count+update count+delete count-update count=Insert count+delete count should always match with source count.

But I was trying to do it in datastage. I mean any column that I will add to source field or any way that give me clue to differentiate between update and Insert.

Regards
~Manoj
Manoj
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not sure what you mean by 'in DataStage'. We've designed our Type2 tables to include metadata columns that let us know what was inserted / updated per load but that allows collection of the information after the fact by querying the targets. We also use separate target stages for 'insert inserts' and 'update inserts' so the monitor shows more appropriate counts.

Note that I'm using Informatica and don't have a fancy SCD stage, if that's what you are using. It's all old school brute force. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply