Job under control failed to start
Posted: Mon Jun 04, 2007 2:19 am
Has anyone seen this happen? Like to advance a cogent explanation?
The job controller reported (-14) that the job under control failed to start within the 60 seconds allowed. The job run request was made at 4:39:08 and the failure was reported at 4:40:10, at which time the job controller aborted, the abort having been generated in DSRunJob(). The code did not get to DSWaitForJob().
However, the job under control DID start, logging its start time as 4:40:30 - some 20 seconds after it had been reported as failing to start.
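To make the timeline explicit, here is a minimal sketch of the arithmetic using the three timestamps quoted above. The variable names are purely illustrative, and the date part is irrelevant since only the differences matter.

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Timestamps taken from the log events described in this post.
requested = datetime.strptime("04:39:08", FMT)       # job run request made
failure_logged = datetime.strptime("04:40:10", FMT)  # controller reported -14 and aborted
job_started = datetime.strptime("04:40:30", FMT)     # controlled job logged its start

# Elapsed seconds between each pair of events.
request_to_failure = int((failure_logged - requested).total_seconds())
failure_to_start = int((job_started - failure_logged).total_seconds())
request_to_start = int((job_started - requested).total_seconds())

print(request_to_failure, failure_to_start, request_to_start)
```

So the controller gave up just after the 60-second allowance expired, and the controlled job logged its start well over a minute after the run request was made.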
Its log events showed that it was run under control of the now aborted job controller, and it finished (still under control) some 15 minutes later with no warnings or errors.
There was nothing of interest left in &PH&, and - because of the "under control" events - it is safe to say that the job was not run manually.
The controlling job is a job control routine that's been around since 2003, before job sequences were invented. The folks here say that this behaviour has occurred on a couple of other occasions, sometimes in different (production) projects on the same server.
Eight CPUs are barely ticking over (%Idle > 95%) and very little of the available memory is being consumed. This most recent example occurred when only this control job was started, immediately after rebooting the server. Demand for other resources was close enough to zero - just the ODBC drivers being called from the controlled job.
There is no before-job or before-stage subroutine in the controlled job. It contains two Transformer stages, one performing eight lookups, the other performing some calculations into three output links. 15 minutes is its usual execution time.