11.5 WLM Stuck Job Workaround


Moderators: chulett, rschirm, roy

sjfearnside
Premium Member
Posts: 278
Joined: Wed Oct 03, 2007 8:45 am


Post by sjfearnside »

Just a note about a problem in our 11.5 prod WLM system and a solution I found.

There was a stuck job on the default queue with no PID found on the server.
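For anyone who wants to confirm that a queued job's PID really is gone from the server, here is a minimal sketch. It assumes a Unix engine host and that you already have the PID the queue reports for the job; the helper name `pid_is_running` is my own and not part of DataStage.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists on the local Unix host."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    try:
        # Signal 0 sends nothing; it only checks existence and permissions.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the queue entry is orphaned
    except PermissionError:
        return True   # process exists but is owned by another user
    return True


if __name__ == "__main__":
    # Sanity check against our own PID, which must exist.
    print(pid_is_running(os.getpid()))
```

If this returns False for the PID shown against a queue entry, the entry has no process behind it, which matches what we saw.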

The jobs going to the default queue were waiting a long time before processing, even though there were no limiting factors such as the concurrent jobs limit or excessive CPU or memory usage.

IBM has a patch (JR57145) that you can request, and it is included in 11.5 FP2, but it does not remove the stuck job from the queue.

I did a little digging and found what I think is a better and less invasive workaround if you use the default queue to run your jobs. I created a custom queue and set it as the default, and any job submitted after that went to the new default queue. Any job using a sequencer on our system that was started before the new default queue was set up still sent jobs to the old default queue.

If you are an admin, you can move the waiting jobs to the top of the queue so that they sit ahead of the stuck job, and they will then execute.

I set wlmon=0 in the DSODBConfig.cfg file to turn off WLM and expected it would not take effect until our regularly scheduled monthly reboot. Judging by the DSWLM logs, it appears to have taken effect that night around midnight, even though the DataStage engine was not restarted. :?
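If you want to check what the file currently says before or after making the change, a rough sketch follows. It assumes DSODBConfig.cfg is plain KEY=VALUE text with `#` comment lines, and it matches the key case-insensitively because I am not certain of the exact casing on every install. The function name `read_setting` and the sample text are made up for illustration.

```python
from typing import Optional


def read_setting(cfg_text: str, key: str) -> Optional[str]:
    """Return the last value assigned to `key` in KEY=VALUE text, or None."""
    value = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (assumed to start with '#').
        if not line or line.startswith("#"):
            continue
        name, sep, val = line.partition("=")
        # Case-insensitive key match is an assumption, not documented behavior.
        if sep and name.strip().lower() == key.lower():
            value = val.strip()
    return value


if __name__ == "__main__":
    # Hypothetical file contents; read your real file with open(path).read().
    sample = "# sample config\nwlmon=0\n"
    print(read_setting(sample, "wlmon"))
```

A result of None means the key is not set in the file at all, which is different from it being set to 0.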

Interestingly, I found some apps in the DSWLM directory that appear to start and stop the WLM. I have not tested them yet, but I will, and I will post what I find here.

Hopefully this information will be of assistance to someone in the future.

Thanks