how and which stages are used to main version in SCD 2

rafik2k · Post by **rafik2k** » Tue Dec 06, 2005 1:23 am

Hi all,
i want to implement SCD 2 in parallel job. i have tried using
change capture and change apply stage, but using these stage i am not able to maintain version control.
my requirement is that if
1) new record in the source -> then insert it into target with new row id
2) changed record in the source -> then insert this record into the target as well as update the exisiting record with new time(end time) to main history.

Any help would be greatly appreciated. Thanks!

rafiq

richdhan · Post by **richdhan** » Tue Dec 06, 2005 3:57 am

Hi Rafiq,

The inputs to the CDC should be your source and target. The output of the CDC should lead to a transformer. Have 2 constraints in the transformer

1. change_code = 1 or change_code= 3
2. change_code = 3

From the transformer use the 1st constraint and lead the records to a funnel. From the transformer use the 2nd constraint and lead the records to a join stage and join with the target records. The output of the join stage should go to the same funnel. Now the output of the funnel will have both inserts and updates.

HTH
--Rich

jasper · Post by **jasper** » Tue Dec 06, 2005 4:02 am

personally I would not funnel. When you have inserts and updates split, keep them split up. It will improve performance .
Off cource dependant an DB and a lot of other parameters.

rafik2k · Post by **rafik2k** » Wed Dec 07, 2005 12:06 am

Hi Rich,
thanks for ur quick response
As per your solution i went thrugh it. But whenever there is any insertion or updation in the source file(text file), the conditions in the transformer become false. when i give condition like change_code=2 then it become false?
which is i suppose for deletetion. so what may be the reason?

my requirement is as
suppose i have source data like
NO NAME SAL
10 ABC 100
20 XYZ 200
30 CBA 300

and in the target i should have for the first time populate like this

SURROGATE_KEY NO NAME SAL VERSION
1 10 ABC 100 1000
2 20 XYZ 200 2000
3 30 CBA 300 3000

now for the next load, in the souce file gets changed like

NO NAME SAL
10 ABC 100
20 XYZ 200
30 XXX 300(changed in name)
40 SSS 400(new row)

for the next load, the target should have records like

SURROGATE_KEY NO NAME SAL VERSION
1 10 ABC 100 1000
2 20 XYZ 200 2000
3 30 CBA 300 3000
4 30 XXX 300 3001(increment corr version by 1)
5 40 SSS 400 4001

any solution or help will be greatly appreciated

Thanks in advance
Rafiq

ArndW · Post by **ArndW** » Wed Dec 07, 2005 1:46 am

Rafik,

the basic approach is correct, but something seems to not be working correctly in your case. Could you also output the change_code value and see what it is for each row?

richdhan · Post by **richdhan** » Wed Dec 07, 2005 4:17 am

Hi Rafiq,

I think the problem might be with the link ordering in Change Capture stage. As Arnd suggested write the output of the CDC to a peek stage and check the value for change_code column. If the results are reversed then the link order should be changed in Change Capture stage.

HTH
--Rich

rafik2k · Post by **rafik2k** » Wed Dec 07, 2005 5:06 am

Thanks Arnd and Rich,
The problem has been resolved after changing the link order.
Now my problem is how to maintain the version number of each record in the target as i mentioned in the last message.
source as well as target are text files

any solution?

Thanks in advance

Rafiq

ameyvaidya · Post by **ameyvaidya** » Wed Dec 07, 2005 2:12 pm

AFAIK the only 2 operations supported by sequential files are append and overwrite.

The sequential file stage has no mechanism for updating previously written rows. if it is imperative that output be a text file, you will need to design logic that re-creates the entire file.

One thing that comes to mind is separate the updates and deletes from the inserts and copies and then lookup the original file(Stream) against the updates(Reference) (reversed this intentionally), update the end date and recombine these records with the inserts and copies and build a new file with all the records.

IHTH

laknar · Post by **laknar** » Thu Mar 27, 2008 10:36 pm

Hi,

can any one suggest me the Design of SCD2.

how can i generate Surr Key for Next load.

ray.wurlod · Post by **ray.wurlod** » Thu Mar 27, 2008 10:47 pm

There are at least five different ways for generating surrogate keys which you would have discovered had you bothered to Search rather than hijacking a thread that is more than two years old.

hamzaqk · Post by **hamzaqk** » Thu Mar 27, 2008 11:59 pm

Code: Select all

1.Just perform a join on the natural key columns and seperate the new records from the old ones
2.In the next step assign the surrogate keys to the new records. 
3.Third would be the update.

.

P.S Are my DWH concepts outdated or your talking about type 3 and not 2 here because i thought in SCD type 2 a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key.

altruist · Post by **altruist** » Mon Jun 08, 2009 10:43 am

Sorry to bring up the old post. But this is the closest that I found from my search.

I am planning to implement SCD and obtaining three files Insert, UpdateType2, UpdateType1. Using which I update the Target Table. However I have versioning in Target Table (in addition to it, I want to update based on surrogate keys due to the huge number of natural keys, which eventually would be the where clause while updating instead of surrogate key).

Is the below approach approach a good solution or do we implement SCD type 2 for versioning similar to that of Server Edition.
As with the below solution I don't see that the CDC stage is being used efficiently.

AFAIK the only 2 operations supported by sequential files are append and overwrite.

The sequential file stage has no mechanism for updating previously written rows. if it is imperative that output be a text file, you will need to design logic that re-creates the entire file.

One thing that comes to mind is separate the updates and deletes from the inserts and copies and then lookup the original file(Stream) against the updates(Reference) (reversed this intentionally), update the end date and recombine these records with the inserts and copies and build a new file with all the records.

IHTH

Gladiator1 · Post by **Gladiator1** » Thu Jul 09, 2009 5:50 am

If someone has implemented a SCD 2 in DataStage using CDC Stage.It would be helpful if they could put a screenshot of the job.

DSXchange

how and which stages are used to main version in SCD 2

how and which stages are used to main version in SCD 2

how to maintain version number in SCD 2

Generating Surrogate Key as parameter