Hi All, we are discussing best approach for handling SCDs. I'm thinking of creating a Multi-instance job where the following will be passed as
parameters
Table Name (Dimension table to update)
Schema
File Name (Dimension Data from Input)
File Name (Existing Dimension Data)
There will be more parms but the above is a start. There has been discussions about handling the SCD through stored procedures.
Just wondering if anyone else has built a common process to handle SCDs and what they did.
SCD Common Multi Instance Job
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Hi,
Check this link. viewtopic.php?t=97043. You should use the CDC for finding the delta values. Use the Lookup stage in conjunction with Column Generator stage to get the maximum surrogate value and to generate new surrogate key values.
HTH
--Rich
Check this link. viewtopic.php?t=97043. You should use the CDC for finding the delta values. Use the Lookup stage in conjunction with Column Generator stage to get the maximum surrogate value and to generate new surrogate key values.
HTH
--Rich
Or stall until you get your hands on version 8x...
The SCD stage is extremely powerful. It handles any type 1 or type 2 variation that I've encountered in the past. It doesn't do type 3, but I find type 3 to be an infrequent requirement... probably was not enough of a business case for IBM to invest in supporting type 3.
Mike
The SCD stage is extremely powerful. It handles any type 1 or type 2 variation that I've encountered in the past. It doesn't do type 3, but I find type 3 to be an infrequent requirement... probably was not enough of a business case for IBM to invest in supporting type 3.
Mike
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
I have helped build a common SCD container in DataStage Server edition. The Type 2 fields were packed into a single text field with delimiters and the key fields were packed into another field, the two fields were passed into a container where the key was looked up, the type 2 fields were compared and inserts were generated with tagging fields added. If you are adventurous you can use the CRC32 function instead of a packed type 2 field (and risk the false positives). I don't think this approach works so well in parallel jobs. A server job made it easy to pack and unpack the fields.
You could try building a custom stage or just wait until you are ready to move to 8.1.
You could try building a custom stage or just wait until you are ready to move to 8.1.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn