Sequential File read limit

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Sequential File read limit

Post by SURA »

Hello All

I would like to understand why the DataStage Sequential File stage is not able to read a file that is more than x GB.

Several links say the limit is at the OS level and not with DataStage, but my interest is slightly different.

I have a 6 GB text file / 18 million records.

OS: Windows 2008 R2 / 64-bit

Code:

SEQ_FILE --> ODBC

DataStage can load up to 12 million records / up to 4 GB of data (due to the OS limitation).
Same OS with SQL Server 2008.

Code:

SQLSERVER Import wizard to Table.

I can use the import wizard to load the same data file without any limitation issues! (I haven't tried SSIS.)
At the end of the day, I am trying to load it into a table.

What is the difference in the way the file is read by the import wizard vs. DataStage?
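
A quick way to rule the OS itself in or out is to seek past the 4 GB mark in the file directly. A rough Python sketch (the path is only a placeholder):

Code:

# Quick sanity check (Python 3): can this OS/filesystem read past the 4 GB mark
# in the file at all? The path below is only a placeholder.
import os

path = r"C:\data\big_extract.txt"

size = os.path.getsize(path)
print("Reported size: %d bytes (%.2f GB)" % (size, size / 1024.0 ** 3))

with open(path, "rb") as f:
    f.seek(4 * 1024 ** 3)        # jump just past the 4 GiB boundary
    chunk = f.read(64)
    print("Read %d bytes at an offset beyond 4 GiB" % len(chunk))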

Please throw some light :lol:
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Have you tried using multiple readers?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Hadn't thought of that one, Ray.

Thank you so much for your direction; let me try it and get back to you.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

The first thing to do is to isolate the issue. I doubt that reading from the sequential file is the problem.

Create a copy of your job that looks like this:

Code:

Sequential File --> Copy

I would suspect the ODBC driver or the ODBC DSN setup before I would suspect the sequential file read.

Mike
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Thanks Mike

I will check this too.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Thanks Mike.

It could be the ODBC limit; as suggested, the job with the Copy stage pulled all the rows.


Thanks Ray

You are 100% right; increasing the number of readers as suggested solved the problem.

Thank you so much guys for your valuable input.

Another good lesson for me. :D
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Sorry guys

I can't believe it; all of a sudden, the same job is failing again.

This time I am getting the error below.
SRC_TRANS,0: Error reading on import.
SRC_TRANS,0: Consumed more than 100000 bytes looking for record delimiter; aborting
Any changes?
No changes to the Server, Job, OS...

Then how did this job run successfully and manage to load the same file before?

I have no clue!

I found the IBM suggestion below, but it didn't work.

http://www-01.ibm.com/support/docview.w ... wg21651999

The other day when Mike suggested reading the data using the Copy stage, that job ran successfully, but today it is also failing with the same error.
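
If the file really is unchanged, it is worth checking what the import is actually seeing. Here is a rough Python sketch (the path is only a placeholder) that reports the delimiter style and the longest stretch between newlines; anything over 100,000 bytes would trigger exactly this abort.

Code:

# Rough diagnostic (Python 3): which delimiter style does the file use, and how
# long is the longest stretch without a newline? The path is only a placeholder.
path = r"C:\data\big_extract.txt"

longest = 0      # longest run of bytes seen without a line feed
current = 0
lf = crlf = 0
prev = -1

with open(path, "rb") as f:
    while True:
        chunk = f.read(1024 * 1024)
        if not chunk:
            break
        for b in chunk:
            if b == 0x0A:                  # LF
                lf += 1
                if prev == 0x0D:           # preceded by CR -> CRLF record
                    crlf += 1
                if current > longest:
                    longest = current
                current = 0
            else:
                current += 1
            prev = b

if current > longest:
    longest = current

print("LF delimiters   :", lf)
print("...of which CRLF:", crlf)
print("Longest stretch without a delimiter:", longest, "bytes")
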
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Then it is no longer the 'same file' if suddenly it can't find the record delimiter.
-craig

"You can never have too many knives" -- Logan Nine Fingers
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Thanks craig.

But I am 100% sure there are no changes to the file, because it is the same file, sitting in the project folder (on my PC), that I used before.

I was doing end-to-end testing, and to minimize the load time I didn't use the large file. Now that my testing is done, I tried to use the same large file again that I had used before.

Anyhow, let's see.
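
For what it's worth, a checksum compared against any earlier copy of the file would settle the "same file" question beyond doubt. A minimal Python sketch (the path is only a placeholder):

Code:

# Minimal sketch (Python 3): MD5 of the file, to compare with any earlier copy
# or a previously recorded value. The path is only a placeholder.
import hashlib

def md5_of(path, chunk_size=1024 * 1024):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(md5_of(r"C:\data\big_extract.txt"))
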
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Something is different, obviously. I'd be curious what you end up tracking down.
-craig

"You can never have too many knives" -- Logan Nine Fingers
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Yes, that's right.

If I recall correctly, as I mentioned in my initial post, there are 18 million+ records in total. But when I ran the first load, the ETL successfully loaded 12 million records and didn't give any warning or error about not loading the rest of the records.

Is that how the ETL tool behaves? I don't know!

Anyhow, I will track this down.
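
One way to make that silent shortfall visible is a simple reconciliation after each load. A rough sketch using Python and pyodbc (the DSN, table and file names are placeholders):

Code:

# Rough reconciliation sketch (Python 3 + pyodbc): compare the record count in
# the source file with the row count in the target table after the load.
# The DSN, table and file names are placeholders.
import pyodbc

path = r"C:\data\big_extract.txt"

# Count newline-terminated records in the file.
file_rows = 0
with open(path, "rb") as f:
    for _ in f:
        file_rows += 1

# Count rows in the target table.
conn = pyodbc.connect("DSN=MY_SQLSERVER_DSN;Trusted_Connection=yes")
table_rows = conn.cursor().execute("SELECT COUNT(*) FROM dbo.MY_TARGET_TABLE").fetchone()[0]

print("File records:", file_rows)
print("Table rows  :", table_rows)
print("Shortfall   :", file_rows - table_rows)
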
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.