Sequential File read limit

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Sequential File read limit

Post by SURA »

Hello All

I would like to understand why the DataStage Sequential File stage is not able to read a file that is more than x GB.

Several links say the limit is at the OS level and not with DataStage, but my interest is slightly different.

I have a 6 GB text file / 18 million records.

OS: Windows 2008 R2 / 64-bit

Code:

SEQ_FILE --> ODBC

DataStage can load up to 12 million records / up to 4 GB of data (due to the OS limitation).
Same OS with SQL Server 2008.

Code:

SQLSERVER Import wizard to Table.

I can use the import wizard to load the same data file without any limitation issues! (I haven't tried SSIS.)
At the end of the day, I am trying to load it into a table.

What is the difference in the way the file is read by the import wizard vs. DataStage?
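
A quick way to rule the OS itself in or out is to seek past the 4 GB mark in the file directly. A rough Python sketch (the path is only a placeholder):

Code:

# Quick sanity check (Python 3): can this OS/filesystem read past the 4 GB mark
# in the file at all? The path below is only a placeholder.
import os

path = r"C:\data\big_extract.txt"

size = os.path.getsize(path)
print("Reported size: %d bytes (%.2f GB)" % (size, size / 1024.0 ** 3))

with open(path, "rb") as f:
    f.seek(4 * 1024 ** 3)        # jump just past the 4 GiB boundary
    chunk = f.read(64)
    print("Read %d bytes at an offset beyond 4 GiB" % len(chunk))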

Please throw some light :lol:
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Have you tried using multiple readers?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Hadn't thought of that one, Ray.

Thank you so much for your direction; let me try it and get back to you.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

The first thing to do is to isolate the issue. I doubt that reading from the sequential file is the problem.

Create a copy of your job that looks like this:

Code:

Sequential File --> Copy

I would suspect the ODBC driver or the ODBC DSN setup before I would suspect the sequential file read.

Mike
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Thanks Mike

I will check this too.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Thanks Mike.

It could be the ODBC limit; as suggested, the job with the Copy stage pulled all the rows.


Thanks Ray

You are 100% right; increasing the number of readers as suggested solved the problem.

Thank you so much guys for your valuable input.

Another good lesson for me. :D
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Sorry guys

I can't believe it; all of a sudden, the same job is failing again.

This time I am getting the error below.
SRC_TRANS,0: Error reading on import.
SRC_TRANS,0: Consumed more than 100000 bytes looking for record delimiter; aborting
Any changes?
No changes to the Server, Job, OS...

Then how did this job run successfully and manage to load the same file before?

I have no clue!

I found the IBM suggestion below, but it didn't work.

http://www-01.ibm.com/support/docview.w ... wg21651999

The other day when Mike suggested reading the data using the Copy stage, that job ran successfully, but today it is also failing with the same error.
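
If the file really is unchanged, it is worth checking what the import is actually seeing. Here is a rough Python sketch (the path is only a placeholder) that reports the delimiter style and the longest stretch between newlines; anything over 100,000 bytes would trigger exactly this abort.

Code:

# Rough diagnostic (Python 3): which delimiter style does the file use, and how
# long is the longest stretch without a newline? The path is only a placeholder.
path = r"C:\data\big_extract.txt"

longest = 0      # longest run of bytes seen without a line feed
current = 0
lf = crlf = 0
prev = -1

with open(path, "rb") as f:
    while True:
        chunk = f.read(1024 * 1024)
        if not chunk:
            break
        for b in chunk:
            if b == 0x0A:                  # LF
                lf += 1
                if prev == 0x0D:           # preceded by CR -> CRLF record
                    crlf += 1
                if current > longest:
                    longest = current
                current = 0
            else:
                current += 1
            prev = b

if current > longest:
    longest = current

print("LF delimiters   :", lf)
print("...of which CRLF:", crlf)
print("Longest stretch without a delimiter:", longest, "bytes")
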
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Then it is no longer the 'same file' if suddenly it can't find the record delimiter.
-craig

"You can never have too many knives" -- Logan Nine Fingers
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Thanks craig.

But I am 100% sure there are no changes to the file, because it is the same file, sitting in the project folder (on my PC), that I used before.

I was doing end-to-end testing, and to minimize the load time I didn't use the large file. Now that my testing is done, I tried to use the same large file again that I had used before.

Anyhow, let's see.
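
For what it's worth, a checksum compared against any earlier copy of the file would settle the "same file" question beyond doubt. A minimal Python sketch (the path is only a placeholder):

Code:

# Minimal sketch (Python 3): MD5 of the file, to compare with any earlier copy
# or a previously recorded value. The path is only a placeholder.
import hashlib

def md5_of(path, chunk_size=1024 * 1024):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(md5_of(r"C:\data\big_extract.txt"))
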
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Something is different, obviously. I'd be curious what you end up tracking down.
-craig

"You can never have too many knives" -- Logan Nine Fingers
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Yes, that's right.

If I recall correctly, as I mentioned in my initial post, there are 18 million+ records in total. But when I ran the first load, the ETL successfully loaded 12 million records and didn't give any warning or error about not loading the rest of the records.

Is that how the ETL tool behaves? I don't know!

Anyhow, I will track this down.
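
One way to make that silent shortfall visible is a simple reconciliation after each load. A rough sketch using Python and pyodbc (the DSN, table and file names are placeholders):

Code:

# Rough reconciliation sketch (Python 3 + pyodbc): compare the record count in
# the source file with the row count in the target table after the load.
# The DSN, table and file names are placeholders.
import pyodbc

path = r"C:\data\big_extract.txt"

# Count newline-terminated records in the file.
file_rows = 0
with open(path, "rb") as f:
    for _ in f:
        file_rows += 1

# Count rows in the target table.
conn = pyodbc.connect("DSN=MY_SQLSERVER_DSN;Trusted_Connection=yes")
table_rows = conn.cursor().execute("SELECT COUNT(*) FROM dbo.MY_TARGET_TABLE").fetchone()[0]

print("File records:", file_rows)
print("Table rows  :", table_rows)
print("Shortfall   :", file_rows - table_rows)
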
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.