Job in Running State

Posted: Thu Jul 26, 2007 5:48 am
by abhilashnair
We have a server job which runs daily. Normally it takes approximately 7 minutes to complete, but yesterday's run of this job is still in a running state, with no warnings whatsoever. We are not accessing any databases, only seq files and hash files. All the required files are in place and have the proper rights. We are also able to view the data through Designer. But the job won't move ahead. Please reply.

Posted: Thu Jul 26, 2007 7:12 am
by chulett
Is the job actually running when you check from the O/S?

Code:

ps -ef | grep phantom | grep -v grep
Does that show any processes that you recognize as coming from your job in question?
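
If the list of phantom processes is long, it can help to narrow it to a single job. The following is a minimal sketch, assuming the job name appears in the phantom's command line (which is typically the case, but is an assumption here). The function name `filter_job_phantoms` and the job name `MyDailyJob` are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Filter `ps -ef` style output (read from stdin) down to phantom
# processes whose command line mentions the given job name.
# Assumption: the job name shows up in the phantom's command line.
filter_job_phantoms() {
    job_name="$1"
    grep phantom | grep -v grep | grep -F -- "$job_name"
}

# Usage (MyDailyJob is a placeholder job name):
#   ps -ef | filter_job_phantoms MyDailyJob
```

The `-F` option makes grep treat the job name as a literal string, so names containing dots or other special characters match as written.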

Posted: Thu Jul 26, 2007 7:23 am
by abhilashnair
chulett wrote:Is the job actually running when you check from the O/S?

Code:

ps -ef | grep phantom | grep -v grep
Does that show any processes that you recognize as coming from your job in question?
Yes. The command shows processes related to this job.

Posted: Thu Jul 26, 2007 7:58 am
by chulett
Does the job respond to a 'Stop' request? I'm not sure, off the top of my head, what would cause a job that allegedly only uses 'seq files and hash files' to hang like that. You might want to expand on what your exact job design is to see if there is any issue there...

Posted: Thu Jul 26, 2007 8:23 am
by abhilashnair
The job does respond to a Stop request as well as a Reset request. I tried stopping it, then resetting it, and then restarting it. Still the same thing.

Regarding the structure, the job reads records from a seq file, does a lookup and a couple of transformations, then writes the output to another set of seq files.
The job has not been changed for quite some time now. In fact it ran well on 07/24. The issue started yesterday, i.e. 07/25.

Posted: Thu Jul 26, 2007 9:07 am
by chulett
Then you need to ascertain what changed yesterday. If not the job itself, then what in the job's environment? Obviously, something did - the hard part can be figuring out what.

Another thing could be the data. Is there anything 'unusual' (whatever that means) about that day's data? Especially looking around the record count that matches where it gets 'stuck'?
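
One way to look at the data around a suspect record is sketched below. It assumes one record per line in the sequential file; the function name `show_records_around` and the file path in the usage comment are hypothetical. `cat -v` is used so that stray control characters become visible.

```shell
#!/bin/sh
# Print the lines of a sequential file around a given record number,
# with non-printing characters made visible by `cat -v`.
# Assumption: one record per line.
show_records_around() {
    file="$1"
    record="$2"
    context="${3:-2}"          # lines to show on each side (default 2)
    start=$((record - context))
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    end=$((record + context))
    sed -n "${start},${end}p" "$file" | cat -v
}

# Usage (path and record number are placeholders):
#   show_records_around /path/to/input.seq 1500 3
```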

Posted: Thu Jul 26, 2007 9:12 am
by abhilashnair
chulett wrote:Then you need to ascertain what changed yesterday. If not the job itself, then what in the job's environment? Obviously, something did - the hard part can be figuring out what.

Another thing could be the data. Is there anything 'unusual' (whatever that means) about that day's data? Especially looking around the record count that matches where it gets 'stuck'?
The job was not changed yesterday.

No rows are being read from the input file. The link is in a running state and shows blue, but the statistics show 0 rows, 0 rows/sec. The file itself is fine, and I can view the data too.

Posted: Thu Jul 26, 2007 10:50 am
by Krazykoolrohit
Reset the job and clear the status file (using Director).

Posted: Thu Jul 26, 2007 11:17 am
by abhilashnair
That was the first thing I did. :o

Posted: Thu Jul 26, 2007 1:47 pm
by ray.wurlod
abhilashnair wrote:We have a server job which runs daily. Normally it takes approximately 7 minutes to complete, but yesterday's run of this job is still in a running state, with no warnings whatsoever. We are not accessing any databases, only seq files and hash files. All the required files are in place and have the proper rights. We are also able to view the data through Designer. But the job won't move ahead. Please reply.
What has changed?

"Nothing" is not the correct answer. Track it down.

Posted: Thu Jul 26, 2007 6:38 pm
by kausikMitra
abhilashnair wrote:We have a server job which runs daily. Normally it takes approximately 7 minutes to complete, but yesterday's run of this job is still in a running state, with no warnings whatsoever. We are not accessing any databases, only seq files and hash files. All the required files are in place and have the proper rights. We are also able to view the data through Designer. But the job won't move ahead. Please reply.
Abhilash
I may be wrong, but can you please check that the user has write access to the area where the hash file is created? Did you run the job earlier as the same user or as a different user? If it was a different user, do both belong to the same group and have the same privileges? Abhilash, you have to be careful when using hash files: used properly they give you good performance, otherwise the reverse. How are you creating the hash file? Is it a full refresh? Are you deleting the file before creating it? If so, please don't also select "clear before writing" in the Update option. What value have you set for the caching attribute? What is the group size?
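
The permission questions above can be checked from the shell. This is a minimal sketch, assuming it is run as the same O/S user that runs the job; the function name `check_write_access` and the directory in the usage comment are hypothetical.

```shell
#!/bin/sh
# Report whether the current user can write to a directory, then show
# the directory's owner, group and mode next to the user's own ids so
# the two can be compared.
check_write_access() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir"
    elif [ -w "$dir" ]; then
        echo "writable: $dir"
        ls -ld "$dir"
    else
        echo "NOT writable: $dir"
        ls -ld "$dir"
    fi
    id
}

# Usage (the path is a placeholder for wherever the hash file lives):
#   check_write_access /path/to/hashfile/directory
```

Running it once as each user that has ever run the job makes differences in group membership easy to spot in the `id` output.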

Posted: Fri Jul 27, 2007 12:54 am
by yaminids
We had the exact same issue. We resolved it by executing a small command (like 'ls -l /home/dstage') as part of an after-job subroutine.

Try that and hopefully it will resolve your issue.

Yamini

Posted: Fri Jul 27, 2007 2:51 am
by reypotxo
abhilashnair wrote:We have a server job which runs daily. Normally it takes approximately 7 minutes to complete, but yesterday's run of this job is still in a running state, with no warnings whatsoever. We are not accessing any databases, only seq files and hash files. All the required files are in place and have the proper rights. We are also able to view the data through Designer. But the job won't move ahead. Please reply.
It could be a locked file issue. Try running the job with a different output file name and, if it works, review the system locks.

Posted: Fri Jul 27, 2007 3:11 am
by abhilashnair
ray.wurlod wrote:
abhilashnair wrote:We have a server job which runs daily. Normally it takes approximately 7 minutes to complete, but yesterday's run of this job is still in a running state, with no warnings whatsoever. We are not accessing any databases, only seq files and hash files. All the required files are in place and have the proper rights. We are also able to view the data through Designer. But the job won't move ahead. Please reply.
What has changed?

"Nothing" is not the correct answer. Track it down.
I decided to go by our very own illustrious Ray's advice. I set out to find the 'something' which changed from one day to the next and 'tracked down' something very interesting. I'm not sure whether this has been discussed before. Here is what I did.

I imported the same job from the production server to the development/test server and brought over all the input seq files as well as the hashed files. Mind you, nothing changes here: job, files, metadata, everything is the same. Only the server has changed.

Then I compiled the job and ran it.
Here is the catch:
the job ran normally without a single warning.
OK, fine! What would I do next? Yes, you guessed it right.

I deleted the problematic job from production, took the good job from dev and imported it to production, then compiled and ran it. You are in for a surprise here: the job got stuck midway through.

Please don't tell me that the entire project on the prod server is corrupt, because that is the only thing I can think of now, even though I hope that's not the case.

There are a whole lot of jobs in that particular project. What do we do next?
:cry:

Posted: Fri Jul 27, 2007 3:21 am
by crystal_pup
Sometimes a job doesn't terminate successfully and takes a lot of time to complete. If you experience such delays, one thing you can do is empty the &PH& directory in your DataStage home directory. The &PH& directory holds information about active stages, and every time a job is run more entries are added to it, so it requires periodic cleaning. I hope this solves your problem.

Cheers,
Kunal
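
The &PH& cleanup suggested above can be approached cautiously by first listing what would be removed. This is a minimal sketch, not a recommended procedure: the function name `list_old_ph_files`, the 7-day threshold and the path in the usage comment are all assumptions for illustration. It only prints candidates and deletes nothing.

```shell
#!/bin/sh
# List files in a directory that were last modified more than the
# given number of days ago. Dry run only: nothing is deleted.
list_old_ph_files() {
    ph_dir="$1"
    days="${2:-7}"             # age threshold in days (default 7)
    find "$ph_dir" -type f -mtime +"$days" -print
}

# Usage (the project path is a placeholder). The quotes matter:
# an unquoted & would be interpreted by the shell.
#   list_old_ph_files '/path/to/project/&PH&' 7
```

Reviewing the list before removing anything, and doing the cleanup while no jobs are running, is probably the safer course.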