
Director cannot get job status

Posted: Mon Apr 24, 2006 11:08 am
by Unfriendly
DataStage Director shows an error message and leaves a job disabled after refreshing a directory.

An error message:
Error calling subroutine: DSR_EXECJOB (Action=5); check DataStage is set up correctly in project xxxxx
(Subroutine failed to complete successfully (30107))

is posted when a certain directory is opened in Director.
One of the jobs cannot be started from a sequence but can be started manually.
This job has the following description: !!Error getting job status!!

If we remove this job the next one in the directory gets the same error.

We have tried rebuilding the indices and restarting the server.

Any ideas would be greatly appreciated.

Posted: Mon Apr 24, 2006 11:18 am
by ArndW
It would seem that your project might be corrupted. When you did the DS.REINDEX did you check to see if any error messages were listed? You might also try running the TCL command DS.CHECKER when no users are logged into the project and look at the output carefully to see if any errors are shown.

Is this a new error that has suddenly occurred? Can you think of something that happened on the machine that might define a "before the error" and "after the error"?

Posted: Mon Apr 24, 2006 11:19 am
by kcbland
Welcome aboard! Is this a new installation, a new project, etc.? Have you looked at the permissions of the userid used to log into the Director client? Does that userid have permissions to the project directory?
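
One way to check the directory permissions asked about above, as a rough sketch only. The function name directory_access is made up for illustration, and the project directory path is something you would supply for your own installation. Note that os.access tests the user running the script, so it would need to be run as the userid in question.

```python
import os

def directory_access(path):
    """Report whether the current user can read, write and search a directory.

    'path' is a placeholder for the project directory on your own server.
    """
    return {
        "exists": os.path.isdir(path),
        "read": os.access(path, os.R_OK),    # can list the directory
        "write": os.access(path, os.W_OK),   # can create or delete entries
        "search": os.access(path, os.X_OK),  # can traverse into it
    }
```

Calling directory_access on the project directory returns a small dictionary, and any False value there would be worth a closer look.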

Posted: Mon Apr 24, 2006 11:58 am
by kcbland
Check the free space on the file system for the project, plus make sure your /tmp isn't full.
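
The free-space check suggested above could be scripted along these lines. This is a minimal sketch: the function name free_space_report and the 5% threshold are assumptions chosen for illustration, not anything DataStage prescribes.

```python
import shutil

def free_space_report(paths, min_free_fraction=0.05):
    """Return (path, free_fraction, is_low) for the file system holding each path.

    min_free_fraction is an arbitrary illustrative threshold.
    """
    report = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        free_fraction = usage.free / usage.total if usage.total else 0.0
        report.append((path, free_fraction, free_fraction < min_free_fraction))
    return report
```

You would pass it "/tmp" plus the path of your project directory, for example free_space_report(["/tmp", project_dir]) where project_dir is your own path.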

Posted: Mon Apr 24, 2006 4:29 pm
by ray.wurlod
Execute the following commands to identify possibly corrupted storage.

Code:

UVFIXFILE DS_JOBS
UVFIXFILE DS_JOBOBJECTS
UVFIXFILE RT_CONFIG805
UVFIXFILE RT_STATUS805
UVFIXFILE RT_LOG805

Posted: Tue Apr 25, 2006 10:06 am
by Unfriendly
Tried the UVFIXFILE as requested.
There is no warning or error message, is there?

However the UVFIXFILE RT_STATUS807 and
UVFIXFILE RT_CONFIG807

gave the following output:

1 group(s) processed
1 group buffer(s) processed
0 record(s) processed.

Where everything else processed a number of records, this processed 0 records. Is this the problem?
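
If there are many files to compare, the summary lines could be pulled out of captured output with something like the sketch below. It assumes the output looks exactly like the three lines quoted above, which is taken from this post and not from any documented format, and the name parse_summary is made up. Whether a count of 0 records actually indicates a problem is not something this script can tell you.

```python
import re

# Matches lines such as "0 record(s) processed." as quoted in the post above.
SUMMARY_LINE = re.compile(r"^\s*(\d+)\s+(.+?)\s+processed\.?\s*$")

def parse_summary(text):
    """Map each summary label (e.g. 'record(s)') to its integer count."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts

sample = """1 group(s) processed
1 group buffer(s) processed
0 record(s) processed."""

summary = parse_summary(sample)
```

For the quoted output, summary["record(s)"] is 0 while the two group counts are 1.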

Posted: Tue Apr 25, 2006 10:29 am
by ArndW
If you do a COUNT DS_JOBS or COUNT DS_JOBOBJECTS do you get 0 records for either or both? If so, then you have a problem that will probably necessitate restoring a backup.

Posted: Tue Apr 25, 2006 4:13 pm
by ray.wurlod
No error or warning message from UVFIXFILE proves that no corruption was found in any of the (hashed) files processed. It is important to eliminate corruption as a cause of your problem.

Re-reading the error message, the corruption is actually in job number 807, so it is these commands that were needed. Sorry about misreading.

Code:

UVFIXFILE DS_JOBS 
UVFIXFILE DS_JOBOBJECTS
UVFIXFILE RT_CONFIG807
UVFIXFILE RT_STATUS807
UVFIXFILE RT_LOG807


(You can skip DS_JOBS and DS_JOBOBJECTS, since you've done them already.)
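
Since the file names differ only in the job number, the command list above can be generated for any job. A small sketch follows; the helper name uvfixfile_commands is made up, and it only builds the command strings in the pattern shown in this post, without running anything.

```python
def uvfixfile_commands(job_number, include_repository=True):
    """Build the UVFIXFILE command strings for one job number.

    Follows the pattern in the post: two repository files, then the
    RT_CONFIG, RT_STATUS and RT_LOG files suffixed with the job number.
    """
    commands = []
    if include_repository:
        commands += ["UVFIXFILE DS_JOBS", "UVFIXFILE DS_JOBOBJECTS"]
    for prefix in ("RT_CONFIG", "RT_STATUS", "RT_LOG"):
        commands.append(f"UVFIXFILE {prefix}{job_number}")
    return commands
```

Passing include_repository=False leaves out DS_JOBS and DS_JOBOBJECTS when those have been checked already.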

Posted: Wed Apr 26, 2006 5:58 am
by Unfriendly
I ran UVFIXFILE on 807 the first time and the result is as previously posted.
No error messages on any of them.

COUNT DS_JOBS
returned 829

COUNT DS_JOBOBJECTS
returned 11538