Set row "create_date" and "last_update_date&q

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
NYCooper
Participant
Posts: 3
Joined: Wed Aug 15, 2012 6:57 am

Set row "create_date" and "last_update_date&q

Post by NYCooper »

Hello,
i am new on data stage and i wonder if i can set a "initial create date" and a "last update date" in a table for each row. The table has one key. The Process is an update/insert. i want to save two date's "initial create date" and the "last update date".
in transformer stage i found only the function currentimestamp() but this way set everytime the date when i run the job...
How can i do this?
Thanks a lot
BI-RMA
Premium Member
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

You have to identify which rows to update and which ones to insert and split the streams for these two operations.

Then you can set initial create date on insert and last update date on update.

But Datastage won't identify which column to write in an Insert/Update-Operation.
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard.
You can perform a lookup to determine whether the key already exists and thereby determine the derivation for create date or update date.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
NYCooper
Participant
Posts: 3
Joined: Wed Aug 15, 2012 6:57 am

Post by NYCooper »

Hello,
the first way works fine. (Split the process in two: one insert and one update each time with the currenttimestamp).
the second way with the lookup i try later, sounds much more professional...
thanks for the ideas!
NYCooper
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The issue with the "upsert" mechanism is that all columns that you need for the insert must also be used in the update. Typically there are columns that you do not want to change, such as the create date, and in order to do that you need to look up the current value and pass that along rather than the current date when the record already exists. And then the insert will need to fail before the update is performed if that is the order of execution.

I personally find the 'combo' insert/update actions to be a crutch and not something one should use the vast majority of the time. Do the lookup, get the information you need for the update and then send them to separate targets doing either the insert or the update. And in this particular case, the only information you may need from the lookup is whether it succeeded or not. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
aartlett
Charter Member
Charter Member
Posts: 152
Joined: Fri Apr 23, 2004 6:44 pm
Location: Australia

Post by aartlett »

I agree with Craig. Split and insert or update separately.

This is especially true if the number of updates is small. Upsert and update operations are incredibly time consuming in most databases, Inserts are normally very quicl. Insert/Update (or update/insert) tend to take a lot of time.

Be careful of your transaction size. These are no absolutes as it depends on number and size of records and indexes impacted. Play with it a bit.
Andrew

Think outside the Datastage you work in.

There is no True Way, but there are true ways.
Post Reply