Copying Surrogate state files - Will the values remain?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
kflow
Participant
Posts: 7
Joined: Sat Sep 01, 2007 4:12 pm

Copying Surrogate state files - Will the values remain?

Post by kflow »

I would like to copy a half dozen surrogate state files from our test to production environment.

Will the values within the files remain the same meaning can I run my jobs in production and have them pick up the surrogate key values from test?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What do you think will happen, and why? Have you examined one of your extant state files?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kflow
Participant
Posts: 7
Joined: Sat Sep 01, 2007 4:12 pm

Post by kflow »

ray.wurlod wrote:What do you think will happen, and why? Have you examined one of your extant state files? ...
I think it will work fine. I'm just being paranoid and wanted to see if anyone has done this.

Yes I have examined the contents of the files via a Surrogate Key generator stage to a peek. I will do the same once I move the files to production.
greggknight
Premium Member
Premium Member
Posts: 120
Joined: Thu Oct 28, 2004 4:24 pm

Post by greggknight »

Well the first question I would ask is.
Are both config.apt file identical as far as how many nodes are defined.
If test has 2 nodes and Prod has 4 nodes your results are not going to be what you expect. and you will get duplicate keys when you run it in prod.
Heres why

if you are running on two nodes the content of the state file is

2050,3050
those are the starting points for each respective node. Now if you move that file to a say 4 node box when the job starts it will
start another number series which then would look like this.

25,1025,2080,3095

so I you already have keyes 1-2049 in you table you will get rejects rejected as long as there is a PK constraint otherwise you will have dups in your table.
By default it uses 1000 number blocks unless you change it.
"Don't let the bull between you and the fence"

Thanks
Gregg J Knight

"Never Never Never Quit"
Winston Churchill
greggknight
Premium Member
Premium Member
Posts: 120
Joined: Thu Oct 28, 2004 4:24 pm

Post by greggknight »

Also you don't need a peek stage to see the contents of a state file.
Just a generator and set the View stet file property to yes.

and you will get this message
Surrogate_Key_Generator_0,0: Content of state file D:\DW_PROJECTS\1_DSProjectDataFiles\PROJ_CORP_DataWareHouse\SurrogateKeys\201\201_tDim_BranchItem.SKEY:
7631237 7631293
7632245 7632293
"Don't let the bull between you and the fence"

Thanks
Gregg J Knight

"Never Never Never Quit"
Winston Churchill
kflow
Participant
Posts: 7
Joined: Sat Sep 01, 2007 4:12 pm

Post by kflow »

Thanks for that response Gregg. I really appreciate it.
greggknight
Premium Member
Premium Member
Posts: 120
Joined: Thu Oct 28, 2004 4:24 pm

Post by greggknight »

I am still pondering the reason you would want to do that.
I would think you would want the max number already being used in production.
"Don't let the bull between you and the fence"

Thanks
Gregg J Knight

"Never Never Never Quit"
Winston Churchill
kflow
Participant
Posts: 7
Joined: Sat Sep 01, 2007 4:12 pm

Post by kflow »

This is a new installation in production. I've been building data in test for some time now and want to copy it over to production.

When you say use a Generator do you mean the Surrogate Key generator stage?
greggknight
Premium Member
Premium Member
Posts: 120
Joined: Thu Oct 28, 2004 4:24 pm

Post by greggknight »

yes
Surrogate_Key_Generator

I built a server process which uses routines and a comma delimited list as a parameter. The list contains a list of all the tables, Dims. The routine contains a case statement which contains all the tables and what their key column name is.

my job takes the table name from the looping counter and passes that to the routine and then the key column name is returned to the sequence and used in the select statement
SELECT '#WareHouseTable#' as TableName,ISNull(Max(#TableField#),0) AS CURRENT_KEY
FROM #WareHouse_LIB#.[dbo].#WareHouseTable#
;
this builds my new state files. Of course you only need to run it once for a table or tables unless you want to rebuild all your state files.
"Don't let the bull between you and the fence"

Thanks
Gregg J Knight

"Never Never Never Quit"
Winston Churchill
blewip
Participant
Posts: 81
Joined: Wed Nov 10, 2004 10:55 am
Location: London

Post by blewip »

greggknight wrote:Also you don't need a peek stage to see the contents of a state file.
Just a generator and set the View stet file property to yes.
Sorry are you saying you can view the State file using the SK Generator Stage?

It's just that I can't see how.
Modern Life is Rubbish - Blur
greggknight
Premium Member
Premium Member
Posts: 120
Joined: Thu Oct 28, 2004 4:24 pm

Post by greggknight »

create a job with just that stage in it.
Then run the job
the output will be in the director log in one of the messages as I showed earlier.

You have to run it there is no show data option.
"Don't let the bull between you and the fence"

Thanks
Gregg J Knight

"Never Never Never Quit"
Winston Churchill
Post Reply