
Abnormal End error while running a DS load job into Oracle

Posted: Sat Jan 29, 2005 10:36 pm
by rrcreddy4
Hi All,

I am new to DS. I have created a new job in DS 6 which selects data from an Oracle table (the SELECT statement has a UNION ALL), writes it to a sequential file, and then goes through a mapping process before I insert the same data into an Oracle table on a different database. The target table has a function-based unique constraint. I get an error during the load job, which reads from the sequential file and loads into an Oracle table through a transformer. The error message says "Abnormal End, Job failed", and then I also get a DSD.SEQClose error. The job has almost a million rows. While inserting I am also using the "Insert or Update existing rows" clause. When I run for fewer rows I do not have an issue.

Please let me know if anyone has run into this....

RC

Posted: Sun Jan 30, 2005 3:14 pm
by ray.wurlod
Welcome aboard! :D

Check the job log file for warning or error messages. For example, by default a DataStage job will abort if 50 warnings have been logged (you set the limit when requesting the job run).

Or it may be that the Oracle rollback segments are too small, in which case there will be an Oracle error message logged. What stage type are you using for the load?

Contemplate using the bulk loader (sqlldr) instead; have DataStage prepare the data, and possibly invoke sqlldr. This will be a lot faster than INSERT statements, not least because it's not transactional. You will, of course, need a sqlldr control script; you can have DataStage prepare this, you can prepare your own, or you can edit the one produced by DataStage (for example to use accurately-sized buffers).
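
To give you an idea, a SQL*Loader control file for a delimited sequential file can be quite small. The sketch below is illustrative only: the file, table and column names are placeholders, not your actual ones, and you'd match the delimiter and column list to whatever your DataStage job writes out.

Code: Select all

-- load.ctl: illustrative only; file, table and column names are placeholders
LOAD DATA
INFILE 'touchpoint_fact.dat'
APPEND
INTO TABLE touchpoint_fact
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"'
( catalog_id,
  customer_id,
  touch_date     DATE "YYYY-MM-DD",
  update_user_id,
  update_date    DATE "YYYY-MM-DD HH24:MI:SS"
)
You'd then invoke it with something along the lines of sqlldr userid=user/password@yourdb control=load.ctl log=load.log, adding direct=true if you want a direct path load.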

Posted: Sun Jan 30, 2005 4:12 pm
by rrcreddy4
Thanks. I am using the Oracle9i stage to load the data from the already-created sequential file into an Oracle table, using the "Insert or update" clause available on the Oracle9i stage. When I load fewer records everything is fine; the issue is when I load more records.

I don't see any errors in the log file. When I watch the job using DS Director, I see this abnormal error and then a DSD.SEQClose BASIC program failure.

Please let me know.

RC

Posted: Sun Jan 30, 2005 5:17 pm
by chulett
When the job aborts, try Resetting the job from the Director. It's the button with the two left-facing arrows like rewind on a VCR. When it does the reset, it may put new information about the problem into the log labelled as "From previous run...". Please post the contents of that log message if it shows up, or let us know if it doesn't.

Posted: Sun Jan 30, 2005 8:28 pm
by rrcreddy4
Hi, thanks for the reply. I do get the "Resetting Previous Job ..." message in the log. In it I do see a message about a "BASIC" error, the DSD.SEQClose error. I can send you the exact error message tomorrow.

Please let me know.

Thanks again
RC

Posted: Sun Jan 30, 2005 9:58 pm
by chulett
Make sure to look for the "previous run" message, not anything about the actual resetting. If it's there, double-click on it to read the detail. There will also be a 'Copy' button that will automatically copy it to the clipboard for you.

Posted: Sun Jan 30, 2005 10:32 pm
by kcbland
Please provide a text sketch of your job, something like this:

Code: Select all

OCI--->xfm--->SEQ--->xfm--->OCI.
The way you described your job design left me scratching my head. I hope your job design is like the above sketch, as it will actually run 2 distinct processes with the second dependent on the full completion of the first.

Now, if you didn't do it this way, but actually have separated it like this:

Code: Select all

OCI--->xfm--->SEQ
SEQ--->xfm--->OCI
then your problem is that the top stream will still be writing the file while the bottom stream is reading the same file. This is really bad.

Now, about the abnormal termination: what commit rate are you using? You say you have 1M rows, but at about what point does it blow up: after 10K, 100K? How long is your job taking? Could you be dealing with database rollback segment issues, "snapshot too old", etc.? You're using the 9i OCI; are you using the 9i client software as well? Are there triggers on the table that could be causing a load issue as you touch a row?
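
If you want to rule out the rollback segment angle, the classic symptom is ORA-01555 "snapshot too old". Assuming you (or your DBA) can query the dictionary views, something like the query below shows how big the rollback segments are and whether they're wrapping or shrinking; the view names are standard 9i, but you'll need the appropriate privileges.

Code: Select all

-- needs access to the V$ views; shows rollback segment size and activity
SELECT n.name,
       s.rssize,
       s.hwmsize,
       s.wraps,
       s.shrinks,
       s.status
FROM   v$rollname n,
       v$rollstat s
WHERE  n.usn = s.usn;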

Posted: Mon Jan 31, 2005 9:46 am
by rrcreddy4
Hi,

You are right. We have a standard way of using DS: we split the jobs into SRC, Look Up, XClean, Xmap, Xkey, Load.

In my case I skipped Xkey. I extracted data from Oracle using OCI---Transformer---SEQ file, then did the cleansing and Xmap, and then started the Load using SEQ---Transformer---OCI.

I do have a trigger on the table to update update_user_id and update_date.

Initially I was committing at the end. After experiencing the issue, I started committing after every 10K rows, but I still got the error. As to how many rows it takes before it blows up, that is not fixed: when I run for 6 months of data (1M rows) it blows up after about 600K rows, but when I run the same job for fewer than 200K rows I am fine initially. To explain in more detail, I am running this job for a span of 12 months. If I run for a 2-month period I am fine for 3 to 4 rounds, and after that even that starts erroring. If I get down to 10 days, I am fine.

Regarding tablespace issues etc., am I supposed to get an Oracle error for that rather than an abnormal termination?

Please advise. I will post the exact error message soon.

RC

Posted: Mon Jan 31, 2005 9:58 am
by kcbland
rrcreddy4 wrote:I do have a trigger on the table to update update_user_id and update_date.
Every row loaded fires this trigger. Can it be disabled on this table so you can do that work in the ETL job instead? You'll save a lot of time.
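
If the DBA agrees, disabling and re-enabling it is one statement each way; the trigger name below is just a placeholder for whatever yours is actually called:

Code: Select all

-- trigger name is a placeholder; substitute your audit trigger
ALTER TRIGGER trg_set_update_audit DISABLE;
-- run the load, setting update_user_id/update_date in the DataStage job itself
ALTER TRIGGER trg_set_update_audit ENABLE;
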
rrcreddy4 wrote: Regarding table space etc. am I supposed to get Oracle error for that rather than abnormal error.
That's why I asked about using the 9i OCI with the 9i client. Until we know the exact error message, we're guessing. I'm guessing you're having an Oracle-related problem, not a DS one, because it seems related to row counts.

Of course, you could have a piece of data within your stream that is blowing up DS. On that note, are you using any DS functions that you wrote in derivations or constraints? Could it be that a data condition takes a path in the logic that is blowing up the job? Abnormal terminations fall into 3 categories in my experience:

(1) Function/Subroutine call doing something it shouldn't, job can't recover
(2) Caching/row buffering, IPC, etc fouling the works
(3) Reading from hash file and writing back same row to same hash file without passive stage in between and attempting to use caching

Posted: Tue Feb 01, 2005 9:45 am
by rrcreddy4
Hi, here are the error messages I am getting. These are under the "From previous run" title.

Project:smqa (whseqa)
Job name:LdTouchpointFactDICatalogs
Event #:127
Timestamp:2/1/2005 10:41:10 AM
Event type:Info
User:qdssmqa
Message:
From previous run
DataStage Job 237 Phantom 18248
jobnotify: Unknown error
DataStage Phantom Finished.
[18253] DSD.StageRun LdTouchpointFactDICatalogs. LdTouchpointFactDICatalogs.xfm 1 0/100 - core dumped.

Project:smqa (whseqa)
Job name:LdTouchpointFactDICatalogs
Event #:128
Timestamp:2/1/2005 10:41:10 AM
Event type:Info
User:qdssmqa
Message:
From previous run
DataStage Job 237 Phantom 18253
Abnormal termination of DataStage.
Fault type is 10. Layer type is BASIC run machine.
Fault occurred in BASIC program DSD.SEQClose at address 8

I am running the Oracle 9i client. Please let me know. Again, I am new to DS.

RC

Posted: Tue Feb 01, 2005 9:59 am
by kcbland
Are you using any row buffering or IPC stages?

Posted: Tue Feb 01, 2005 10:01 am
by scottr
Hi Bland, what is this "database rollback segment issues, snapshot too old" error? One of my jobs is aborting with this message, and if I reset it and run it again it runs fine without failing.

thanks

Posted: Tue Feb 01, 2005 10:09 am
by rrcreddy4
I think we are using row buffering.

RC

Posted: Tue Feb 01, 2005 10:43 am
by rrcreddy4
Hi

I ran with a commit after every 30,000 rows (almost 1M rows total) and got the error below:

From previous run
DataStage Job 237 Phantom 29511
Abnormal termination of DataStage.
Fault type is 10. Layer type is BASIC run machine.
Fault occurred in BASIC program *DataStage*DSR_LOADSTRING at address 6ec.

Posted: Tue Feb 01, 2005 11:18 am
by kcbland
rrcreddy4 wrote:I think we are using row buffering.

RC
See my previous comments about this sometimes being flaky. Turn it OFF and try your job.