error running any job with lookup stage
Posted: Thu Dec 26, 2013 9:13 pm
Hi,
I have a few jobs (in fact, any job with a Lookup or Join stage) that hang in my testing environment.
All of these jobs share a common flow:
we read from an Oracle database, use a Lookup stage to look up data from the Oracle DB, and write the data to a Dataset.
What we did to verify:
We checked with the DBAs; there are no locks on the DB.
We cleared all the RT logs and &PH&.
We bounced the DataStage server twice, without any luck.
We created a copy of the job and replaced the Lookup stage with a Join stage; same result.
We removed the Join/Lookup and implemented the logic in the source Oracle stage; the job completes within 30 seconds.
We moved the same job to another environment and pointed it at the testing database; the job again completes within 30 seconds.
I am not sure what we have missed. I am thinking of deleting and re-creating the project tomorrow or next week, but I would like to understand whether there is anything else I can look at first.
Also, I noticed that every time we run one of the jobs that hang, a process like the one below is created:
dsadm 32416 1 0 Dec26 ? 00:00:00 /opt/app/xxxxxxxxx/InformationServer8.7/Server/PXEngine/bin/osh -f RT_SC59/OshScript.osh -monitorport 13400 -pf RT_SC59/jpfile -impexp_charset UTF-8 -string_charset UTF-8 -input_charset UTF-8 -output_charset UTF-8 -collation_sequence OFF
I have not seen this type of process before, and the information scattered across the forum has confused me further. Maybe I will take another stab at it after some time. How can I clean up processes like this one?
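To make the question concrete: in the `ps` line above, the third column (the parent PID) is 1, which suggests the `osh` process has been detached from the job that started it. Below is a minimal sketch, in Python, of how such entries could be picked out of `ps -ef` output before deciding what to do with them. The names `PsEntry`, `parse_ps_line` and `find_orphaned_osh` are hypothetical, the column layout (UID PID PPID C STIME TTY TIME CMD) is assumed to match the line shown above, and whether a listed process is actually safe to kill is not something this sketch can decide.

```python
from typing import List, NamedTuple

# Sample line copied from the post; the path is left masked as in the original.
SAMPLE_LINE = (
    "dsadm 32416 1 0 Dec26 ? 00:00:00 "
    "/opt/app/xxxxxxxxx/InformationServer8.7/Server/PXEngine/bin/osh "
    "-f RT_SC59/OshScript.osh -monitorport 13400 -pf RT_SC59/jpfile "
    "-impexp_charset UTF-8 -string_charset UTF-8 -input_charset UTF-8 "
    "-output_charset UTF-8 -collation_sequence OFF"
)


class PsEntry(NamedTuple):
    """The fields of one `ps -ef` line that matter here (hypothetical helper)."""
    user: str
    pid: int
    ppid: int
    command: str


def parse_ps_line(line: str) -> PsEntry:
    """Parse one line, assuming the columns UID PID PPID C STIME TTY TIME CMD."""
    fields = line.split(None, 7)  # the 8th field keeps the full command line
    if len(fields) < 8:
        raise ValueError(f"unexpected ps -ef line: {line!r}")
    # int() raises ValueError on the header line ("PID" is not a number).
    return PsEntry(fields[0], int(fields[1]), int(fields[2]), fields[7])


def find_orphaned_osh(lines: List[str]) -> List[PsEntry]:
    """Return osh processes whose parent PID is 1."""
    orphans = []
    for line in lines:
        try:
            entry = parse_ps_line(line)
        except ValueError:
            continue  # skip the header and any malformed lines
        executable = entry.command.split()[0]
        if entry.ppid == 1 and executable.endswith("/osh"):
            orphans.append(entry)
    return orphans
```

Feeding the lines of `ps -ef` output (for example, saved to a file and read with `splitlines()`) into `find_orphaned_osh` gives a list of candidate PIDs to review. The sketch only lists them; it does not kill anything.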
Appreciate your help on this.
Thank you
Sri