DataStage job hangs

Post questions here relating to DataStage Enterprise/PX Edition, for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.


xinhuang66
Participant
Posts: 161
Joined: Wed Aug 02, 2006 4:30 am

DataStage job hangs

Post by xinhuang66 »

A few of our jobs constantly hang after we click the Run button; only in some scenarios do they run through.

However, if we set $APT_DISABLE_COMBINATION to TRUE, all of these jobs run fine.

Has anybody encountered this situation as well? Why does leaving $APT_DISABLE_COMBINATION set to FALSE cause this kind of trouble?
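
For reference, this is how we have been kicking the job off when testing with combination disabled. A sketch only: it assumes $APT_DISABLE_COMBINATION has been added to the job as a parameter, and the project and job names here are made up.

    # Run the job with operator combination disabled.
    # Assumes $APT_DISABLE_COMBINATION is exposed as a job parameter;
    # project and job names are hypothetical.
    dsjob -run -wait -param '$APT_DISABLE_COMBINATION=True' OurProject OurHangingJob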
OddJob
Participant
Posts: 163
Joined: Tue Feb 28, 2006 5:00 am
Location: Sheffield, UK

Post by OddJob »

Is there anything specific about the jobs you're having trouble with? I've seen jobs hang when they read from and write to the same DB table, and operator combination has put the two DB stages in the same process.
xinhuang66
Participant
Posts: 161
Joined: Wed Aug 02, 2006 4:30 am

Post by xinhuang66 »

Yes, we do read from and write to the same table, but in two separate Oracle stages.

Why would that cause trouble? And how can you be sure that reading and writing the same table is what causes the problem?

Our process reads from a table, compares it with an input dataset, then finds the delta and writes it back.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

With $APT_DISABLE_COMBINATION set to False, chances are that the writing and reading are being done in the same process, which leads to a self-deadlock situation in Oracle. You need to force them to be separate processes, perhaps by disabling combinability in just one of the Oracle stages.
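
If you want to see which operators have been combined, dump the score. A sketch, assuming $APT_DUMP_SCORE is likewise exposed as a job parameter (project and job names are placeholders):

    # Dump the parallel score into the job log so you can inspect it.
    # Combined operators appear under an APT_CombinedOperatorController entry.
    dsjob -run -wait -param '$APT_DUMP_SCORE=True' MyProject MyJob

If the Oracle read and write operators show up inside the same combined controller (one player process), that process is holding its own SELECT open while it tries to write: the self-deadlock described above.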
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
OddJob
Participant
Posts: 163
Joined: Tue Feb 28, 2006 5:00 am
Location: Sheffield, UK

Post by OddJob »

You could try following good ETL principles and split your job into something a bit more supportable, i.e. extract (and possibly a bit of your transformation), drop to a dataset, and perform the load in a separate job (see the sketch at the end of this post). At least if something goes wrong you've got the data to hand to see what's happening.

Alternatively, place a Copy stage before the output DB stage, setting Force=True in its properties.

Or, you could just switch off combination for this one job - you probably won't see much change in performance.
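
If you do go the split-job route, the two halves are easy to sequence, either from a sequence job or from the command line. A sketch with made-up project and job names:

    # Run the extract job (landing to a dataset), then the load job.
    # -wait blocks until each job completes; -jobstatus makes the exit
    # code reflect the job's finishing status. All names are hypothetical.
    dsjob -run -wait -jobstatus MyProject ExtractToDataset
    dsjob -run -wait -jobstatus MyProject LoadFromDataset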
xinhuang66
Participant
Posts: 161
Joined: Wed Aug 02, 2006 4:30 am

Post by xinhuang66 »

Unfortunately, our template does things like this:

Read data from a table, do a little comparison with the input, then two output branches point back to the same table: one loads, the other updates.

If we need to guarantee one operation per table per job, we have to split the job into three small pieces, which is a hassle.

If it is a deadlock situation, why didn't the job abort with a 'deadlock' error? I have seen heaps of 'deadlock' errors in the Teradata stage, which is fine, since the job aborts and we know it is a deadlock.

This time, with Oracle, the job just keeps on running over the whole night, even though the data volume is very small and the logic is simple.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Oracle is not Teradata (you may have noticed). Oracle can wait for a very long time for a lock held by the same process (a "self-deadlock") to be released, either by the SELECT completing or the cursor being dropped. All you need to do is to prevent operator combination, so that the SELECT and INSERT are being performed by separate player processes.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
OddJob
Participant
Posts: 163
Joined: Tue Feb 28, 2006 5:00 am
Location: Sheffield, UK

Post by OddJob »

If you think it's a hassle to change the jobs now, wait till you go into production and can't work out why your data loads are failing. You may need to change your priorities from 'what's easy to code' to 'what's easy to support'.