Refresh Development Environment from Production

cdp · Post by **cdp** » Mon Oct 29, 2012 4:09 pm

I am devising a method of refreshing the development DW environment from production. There are the database elements, which I'm on top of. I have a couple of questions about a number of DataStage concepts which I am looking for help on.

i) I've created a job that updates the surrogate key state files in Development with the distinct list of keys from the newly refreshed development dimension tables. Is this the correct approach to ensure that surrogate keys aren't duplicated in future development loads?

ii) What would be the best approach for refreshing the development datasets with the contents of the production datasets?

Thanks very much for your help.
Nick.

ArndW · Post by **ArndW** » Tue Oct 30, 2012 6:43 am

I'm not quite sure I understand (i), but for (ii) the answer depends on your system configuration. A dataset consists of a descriptor file (usually with a .ds suffix), which in turn points to the file(s) containing the actual data. If the paths defined in your configuration file (the one APT_CONFIG_FILE points to) are identical in both environments, you can get away with an OS-level copy of all the files. If the paths are not identical, you have the choice of doing an orchadmin dump and reload of the dataset contents or of refilling the datasets using DataStage jobs. The preferable option is to do it from DataStage, but that is not always practicable in terms of file size or run time.
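As a rough illustration of the path check described above, the following Python sketch compares the `resource disk` entries of two parallel configuration files. The function names, the sample paths, and the sample host names are all hypothetical, and the sketch only tests the path condition mentioned in the post; it does not look at node names or host names.

```python
import re

# Matches lines such as:  resource disk "/data/ds/node1" {pools ""}
# It does not match "resource scratchdisk" entries.
RESOURCE_DISK = re.compile(r'resource\s+disk\s+"([^"]+)"')


def resource_disk_paths(config_text):
    """Return the set of 'resource disk' paths named in a configuration file."""
    return set(RESOURCE_DISK.findall(config_text))


def can_copy_at_os_level(prod_config_text, dev_config_text):
    """True when both environments name exactly the same resource disk paths."""
    return resource_disk_paths(prod_config_text) == resource_disk_paths(dev_config_text)


# Hypothetical single-node production configuration.
PROD = '''
{
  node "node1"
  {
    fastname "prodhost"
    pools ""
    resource disk "/data/ds/node1" {pools ""}
    resource scratchdisk "/scratch/node1" {pools ""}
  }
}
'''

# Same paths, different host: the path check passes.
DEV_SAME_PATHS = PROD.replace("prodhost", "devhost")
# Different dataset path: the path check fails.
DEV_OTHER_PATHS = PROD.replace("/data/ds/node1", "/dev_data/ds/node1")

if __name__ == "__main__":
    print(can_copy_at_os_level(PROD, DEV_SAME_PATHS))
    print(can_copy_at_os_level(PROD, DEV_OTHER_PATHS))
```

If the check fails, the dump-and-reload or job-based refill mentioned above would be the fallback.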

chulett · Post by **chulett** » Tue Oct 30, 2012 7:30 am

I'll chime in on (i) and say it sounds just fine to me, assuming you mean the max() values from the refreshed tables.

cdp · Post by **cdp** » Tue Oct 30, 2012 1:51 pm

Thanks chulett. I couldn't find good information on updating the surrogate key state files, and I am slightly confused as to how they work. They currently have the option "Generate Key from Last Highest Value" set to 'N'. When I did some testing I found that with this setting, even when I updated the state file with the max value, it still generated keys that were lower than that. It seems to allocate ranges which each node then uses when processing. I found a post online somewhere that suggested updating the state file with ALL used values, which should prevent that from happening. Is this not the case?
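The behaviour described above can be pictured with a toy model in Python. This is only a sketch under an assumption: that a generator with "Generate Key from Last Highest Value" set to 'N' hands out the lowest values not recorded as used. It is not DataStage's actual implementation or state file format, and `used_ranges` and `next_free_keys` are hypothetical names.

```python
def used_ranges(keys):
    """Collapse key values into sorted, inclusive (start, end) ranges."""
    ranges = []
    for k in sorted(set(keys)):
        if ranges and k == ranges[-1][1] + 1:
            # Extend the current range by one key.
            ranges[-1] = (ranges[-1][0], k)
        else:
            # Start a new range after a gap.
            ranges.append((k, k))
    return ranges


def next_free_keys(ranges, count, start=1):
    """Hand out `count` keys, filling gaps below the highest used value first.

    Assumed toy behaviour only; `ranges` must be sorted and non-overlapping.
    """
    out = []
    candidate = start
    for lo, hi in ranges:
        # Use any free values that sit before this used range.
        while candidate < lo and len(out) < count:
            out.append(candidate)
            candidate += 1
        # Skip over the used range.
        candidate = max(candidate, hi + 1)
    # Past the last used range, every value is free.
    while len(out) < count:
        out.append(candidate)
        candidate += 1
    return out


if __name__ == "__main__":
    # Only the max value (10) is recorded: lower keys are still handed out.
    print(next_free_keys([(10, 10)], 3))                      # [1, 2, 3]
    # Every used key 1..10 is recorded: new keys start above the max.
    print(next_free_keys(used_ranges(range(1, 11)), 2))       # [11, 12]
```

Under this model, recording only the max value leaves everything below it looking free, which would match the lower keys seen in testing, while recording every used value leaves no such gaps.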

chulett · Post by **chulett** » Tue Oct 30, 2012 4:08 pm

Sorry, forgot to ask - does your state file 'stand alone' or does it support a sequence object in your target database? My understanding is that changes what is stored there and you may very well need to load all values into a stand-alone state file. I'm not certain off the top of my head, been away from this for too long.

Hopefully someone else will chime in with the specifics.

cdp · Post by **cdp** » Tue Oct 30, 2012 4:12 pm

It stands alone, there is no sequence in the database.