Need some info on logic

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

sourabhverma
Participant
Posts: 74
Joined: Thu Jan 05, 2006 2:07 am

Need some info on logic

Post by sourabhverma »

Hi,

We have a parallel job that runs on a monthly basis. It extracts the source data (approximately 100 million records), applies a lookup against two reference tables (one with 5 fields, the other with 15 fields), and propagates all 20 reference fields along with the main-stream fields into the target table.
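
For illustration, the current flow is roughly equivalent to the Python sketch below (all field and key names are hypothetical, just to show the shape of the lookups):

Code:

# Minimal sketch of the existing monthly flow (all field and key names are hypothetical).
ref_a = {"K1": {"a1": 1, "a2": 2, "a3": 3, "a4": 4, "a5": 5}}       # 5-field reference table
ref_b = {"K1": {f"b{i}": i for i in range(1, 16)}}                  # 15-field reference table

def enrich(source_rows):
    """Apply both lookups and propagate every field to the target."""
    for row in source_rows:
        out = dict(row)                               # main-stream fields
        out.update(ref_a.get(row["ref_a_key"], {}))   # lookup against reference table 1
        out.update(ref_b.get(row["ref_b_key"], {}))   # lookup against reference table 2
        yield out                                     # 20 reference fields + source fields

target_table = list(enrich([{"id": 42, "ref_a_key": "K1", "ref_b_key": "K1"}]))
print(target_table)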

The requirement is that whenever there is any change in either of the reference tables, the changed data should also be reflected in the target table, and this should happen without re-extracting the source data.

Could anybody give me an idea on how to do this? Any help is appreciated.
Thanks,
Sourabh Verma
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

update manually :wink:

Do you need to update the values on records loaded previously?
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
sourabhverma
Participant
Posts: 74
Joined: Thu Jan 05, 2006 2:07 am

Post by sourabhverma »

Yes, the changed values should be updated in the final target table as well.
Thanks,
Sourabh Verma
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Is it a full extract from the source?

If yes, wouldn't it be a full refresh (apart from deletes)?

What will happen if your update fails in the middle? You will need some type of control.
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Sainath has already asked the questions I was about to ask; we need answers to those to suggest anything further.

This is the kind of scenario where I would keep the reference table in the target database (as an SCD Type 1 dimension) and store only its surrogate key in the target table. That way, any change in the reference data is automatically reflected for all records. That is probably why the star schema was invented.
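
A minimal sketch of that idea, assuming hypothetical table and column names (the target keeps only the dimension's surrogate key, so an SCD Type 1 overwrite of the dimension corrects every record that references it):

Code:

# Sketch of the surrogate-key / SCD Type 1 idea (table and column names are hypothetical).
dim_ref = {101: {"code": "A", "desc": "old description"}}   # reference kept as a dimension, keyed by surrogate key
fact = [{"fact_id": 1, "ref_sk": 101, "amount": 250.0}]     # target stores only ref_sk, not the 20 reference fields

# SCD Type 1 correction: overwrite the dimension attributes in place.
dim_ref[101]["desc"] = "corrected description"

# A report-time join picks up the corrected attributes without touching
# the 100 million already-loaded target rows.
for f in fact:
    print({**f, **dim_ref[f["ref_sk"]]})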

This should be a question for the person who designed the database, as there is a possibility of misunderstanding.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
sourabhverma
Participant
Posts: 74
Joined: Thu Jan 05, 2006 2:07 am

Post by sourabhverma »

Basically, the idea behind this requirement is: since this is a monthly job, the data is loaded into (appended to) the target table on a monthly basis. Let's say that after loading this data there is an indication that the base reference data is somehow not perfectly correct; once the corrected reference table is available, a job must run that uses the new reference data and then populates the target table accordingly.

I could actually store the extracted data in a dataset or a temporary table first and run the job from that whenever required, but because of the high data volume we don't want to do it that way.
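
In other words, what we need is roughly a correction-only pass over the already-loaded target rows, something like the Python sketch below (all names are hypothetical):

Code:

# Hypothetical correction-only pass: update just the reference-derived columns
# of rows already loaded in the target, keyed on the reference key they carry,
# instead of re-extracting the 100 mn source records.
target_rows = [
    {"id": 1, "ref_key": "K1", "ref_desc": "stale value"},
    {"id": 2, "ref_key": "K2", "ref_desc": "still correct"},
]
corrected_ref = {"K1": {"ref_desc": "corrected value"}}     # only the keys that changed

for row in target_rows:
    fix = corrected_ref.get(row["ref_key"])
    if fix:                                  # touch only rows whose reference data changed
        row.update(fix)

print(target_rows)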
Thanks,
Sourabh Verma
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

If you are planning to do the verification manually, you would be better off with a separate rectification flow that just updates the corrected reference records in the target.
Or, as mentioned, update manually.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ajay.vaidyanathan
Participant
Posts: 53
Joined: Fri Apr 18, 2008 8:13 am
Location: United States

Need some info on logic

Post by ajay.vaidyanathan »

Hi,
Use a CDC... I guess you are referring to Change Data Capture logic. Try using CDC and let me know whether your issue gets resolved.
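
Very roughly, the change-capture idea is to diff the previous and current reference snapshots and drive the target updates from that (much smaller) change set; a hedged Python sketch with hypothetical structures:

Code:

# Rough sketch of the change-capture idea (structures are hypothetical): diff the
# previous and current reference snapshots to find changed keys, then drive an
# update pass against the target from that (much smaller) change set.
prev_ref = {"K1": ("A", 10), "K2": ("B", 20)}
curr_ref = {"K1": ("A", 10), "K2": ("B", 25), "K3": ("C", 30)}

changed_or_new = {k: v for k, v in curr_ref.items() if prev_ref.get(k) != v}
deleted = set(prev_ref) - set(curr_ref)

print("changed/new keys:", changed_or_new)
print("deleted keys:", deleted)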
Regards
Ajay