The job suspends after inserting 150 to 250 thousand rows. By "suspend" I mean the stage stops inserting new rows, and it doesn't return any error or warning messages either. It doesn't happen every day, only once or twice a week.
The properties of the Oracle stage are:
I talked to the SA and the DBAs to confirm there are no network problems between the servers, and the DBAs couldn't see a problem either. Can anyone help?
Yep, it is parallel (auto partition). I was thinking parallelism could be the problem.
You mean change the whole job to a server job? Then the Dataset wouldn't work, so change the Dataset to a Sequential File?
No, he means set the stage property so it runs sequentially in your parallel job, i.e. one instance only. That, or run the job on a single node.
IMHO a better solution would be to understand why it is 'suspended' - deadlocks? Blocking locks? And then change from 'auto' to a method where you control the partitioning - perhaps hash or modulus on the PK value(s) involved, or even round robin.
I don't recall what 'auto' equates to here, but I'm more of a fan of explicit control over it. And I'm sure Ray will be along to enlighten us.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Yes chulett, I talked to the DBA this morning. He can't find any deadlocks, blocking locks or anything of the sort on the database, which is why I feel a bit helpless. The only thing he could find was that there were two insert SQL processes running against this table, which may be causing the problem - and that matches SURA's opinion.
My solution for now is to:
1. change the config file so the job runs on a single node
2. change the partition strategy to hash partitioning (and maybe switch to the Oracle Connector stage type?)
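For step 1, a single-node parallel configuration file looks something like the sketch below (the fastname and disk paths are placeholders - use your own server name and directories):

```
{
  node "node1"
  {
    fastname "etl_server"
    pools ""
    resource disk "/data/ds/datasets" {pools ""}
    resource scratchdisk "/data/ds/scratch" {pools ""}
  }
}
```

Point `$APT_CONFIG_FILE` at this file in the job's parameters so only this one job picks up the single-node layout.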
OK, I created a new config file and made the first node an ORACLE-dedicated node, so the Oracle stage should only use that node for the job. Let's see what happens......
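For anyone following along, a dedicated node pool in the config file looks roughly like this (names and paths are placeholders); the Oracle stage is then constrained to the "ORACLE" pool via the stage's Advanced > Node pool and resource constraints settings:

```
{
  node "node1"
  {
    fastname "etl_server"
    pools "" "ORACLE"
    resource disk "/data/ds/datasets" {pools ""}
    resource scratchdisk "/data/ds/scratch" {pools ""}
  }
  node "node2"
  {
    fastname "etl_server"
    pools ""
    resource disk "/data/ds/datasets" {pools ""}
    resource scratchdisk "/data/ds/scratch" {pools ""}
  }
}
```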
Grrr, the problem happened again.
Now the log shows "oracle connector will run in parallel on 1 processing nodes", but the database side still shows 2 connections......
Best to have that conversation with your DBA. Basically, a record is locked by another process, 'blocking' your process's ability to lock it before the update. I've seen it happen with something as simple as someone browsing the record in Toad and clicking on it in the Schema Browser data window, which locks it. The person exits Toad and bam, your suspended process completes.
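One thing the DBA can run while the job is hung is a blocking-session check. A minimal sketch against `v$session` (assuming Oracle 10g or later, where the `blocking_session` column exists - adjust for your version):

```
-- Sessions currently blocked, and which session is blocking them
SELECT sid, serial#, blocking_session, event, seconds_in_wait
FROM   v$session
WHERE  blocking_session IS NOT NULL;
```

If this returns rows while your job is "suspended", the `blocking_session` value identifies the lock holder - which could be the second insert process, or even someone's Toad session.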
-craig
"You can never have too many knives" -- Logan Nine Fingers
If the job runs OK with the Oracle upsert statement set to sequential, then most likely you have duplicate records causing the deadlock.
Resolving the problem
Ensure all duplicate records land in the same partition, so that the multiple Oracle update processes do not try to access (lock) the same block at the same time. Use hash partitioning, hashing on the columns used in the Oracle WHERE clause.
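The principle can be sketched outside DataStage. This is an illustrative toy (not the engine's actual hash function): hashing the key columns and taking the result modulo the partition count guarantees that every row with the same key value goes to the same partition, so two copies of a key can never be held by different writer processes at once.

```python
def partition_for(key_columns, num_partitions):
    """Pick a partition by hashing the key columns (the WHERE-clause columns)."""
    return hash(tuple(key_columns)) % num_partitions

rows = [
    {"id": 101, "val": "a"},
    {"id": 202, "val": "b"},
    {"id": 101, "val": "c"},  # duplicate key 101
]

# Distribute rows across 4 partitions by hashing the key column "id".
partitions = [[] for _ in range(4)]
for row in rows:
    partitions[partition_for((row["id"],), 4)].append(row)

# Both rows with id=101 land in the same partition, so a single writer
# process handles them one after the other instead of two writers
# deadlocking on the same block.
```

With auto or round-robin partitioning, by contrast, the two id=101 rows can end up on different nodes and collide inside Oracle.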