Hi Experts,
Our DataStage architecture consists of one conductor node and one compute node. The DataStage version is 8.5 FP 1.
I am using a flat-file surrogate key in the Transformer stage of my parallel job, generating the values with NextSurrogateKey() in the derivation field.
When I run this job with 3 million records, it takes 14 hours to finish. The surrogate key flat file is located on an NFS-mounted file system that is visible from both nodes (the conductor node as well as the compute node).
If I instead place a copy of the surrogate key file in the /tmp folder on both nodes, the same job finishes in 1 minute for 3 million records.
My question is: why does the same job have such different run times in these two scenarios?
Surrogate key flat file issue in transformer stage.
I have defined 5 nodes in the configuration file, one conductor node and 4 compute nodes, so I believe the job runs on all of the nodes (conductor as well as compute nodes).
I have not defined a block size in the Surrogate Key tab of the Transformer stage properties; only the "System selected block size" radio button is enabled.
Arvind
As your state file is on NAS/NFS, I would recommend specifying a block size rather than using the system-selected option. You could try a value of 10000 or 20000 and adjust higher or lower from there. The goal is to reduce the number of I/O requests to the file (open/close/read/write) over the NFS connection...this is primarily what is slowing down the job. NFS server load can also be a factor if it is also being used for source/target and datasets.
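To make the reasoning above concrete, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every block of keys reserved from the state file costs one round of file operations over NFS; the function name `state_file_requests` is hypothetical, and the real system-selected block size is not stated in this thread.

```python
import math

def state_file_requests(total_rows, block_size):
    """Estimate how many times the state file is touched.

    Assumption (not from DataStage documentation): each reserved
    block of keys costs one round of open/read/write/close.
    """
    return math.ceil(total_rows / block_size)

rows = 3_000_000  # record count from the original post
for block_size in (1, 1_000, 10_000, 20_000):
    requests = state_file_requests(rows, block_size)
    print(f"block size {block_size:>6}: about {requests} state file requests")
```

Under that assumption, a larger block size shrinks the number of trips to the NFS server roughly in proportion, which is why it is worth trying before moving the file.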
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
This is resolved. We replaced the NextSurrogateKey() function in the transformer derivation with the expression below, and the job finished fine in 10 minutes.
@INROWNUM * @NUMPARTITIONS + @PARTITIONNUM +1 - @NUMPARTITIONS + ps_nxtval
Please note that ps_nxtval is the variable that holds the initial value of the flat file surrogate key.
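For anyone who wants to check how this expression behaves, here is a small Python sketch that simulates it outside DataStage. It assumes @INROWNUM starts at 1 within each partition and @PARTITIONNUM starts at 0; the function names and the sample row counts are made up for illustration.

```python
def surrogate_key(inrownum, numpartitions, partitionnum, ps_nxtval):
    # Same arithmetic as the transformer derivation above.
    return inrownum * numpartitions + partitionnum + 1 - numpartitions + ps_nxtval

def keys_for_job(rows_per_partition, ps_nxtval):
    """Generate keys for every row of every partition.

    rows_per_partition is a hypothetical list of row counts,
    one entry per partition.
    """
    numpartitions = len(rows_per_partition)
    keys = []
    for partitionnum, row_count in enumerate(rows_per_partition):
        for inrownum in range(1, row_count + 1):  # assumed to be 1-based
            keys.append(surrogate_key(inrownum, numpartitions, partitionnum, ps_nxtval))
    return keys

even = keys_for_job([3, 3, 3, 3], ps_nxtval=100)
print(sorted(even))  # evenly spread rows: contiguous keys

uneven = keys_for_job([3, 1], ps_nxtval=0)
print(sorted(uneven))  # uneven partitions: keys stay unique but can have gaps
```

In this sketch the keys are unique across partitions because each partition only ever produces values with its own remainder modulo the partition count. They are contiguous only when rows are spread evenly (for example with round-robin partitioning), and the first key generated is ps_nxtval + 1. Note also that the expression never touches the state file itself, so the highest key used would need to be written back separately if later runs depend on it.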
Arvind