
XML files parsing error

Posted: Wed Apr 04, 2007 1:16 pm
by pratapsriram
Hi,
I am trying to merge 5 XML files into one XML file. The input XML files do not have an LF (Unix newline) character at the end, so each file is essentially a single string of characters with no LF. I expect the merged output file to be the concatenation of the individual strings from each XML file; later I will delete the repeated tags so it looks like a single XML document. When I try this, the parsing error from DataStage is the following:
Inputfiles_profitcenter_XML,0: Error reading on import.
Inputfiles_profitcenter_XML,0: Consumed more than 100,000 bytes looking for record delimiter; aborting

I don't want a record delimiter, so can DataStage handle files without one?

I have searched the forum, but nowhere is it mentioned how to parse XML files like this.
If anyone has done this before, please help me figure this out.
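For what it's worth, here is roughly what I mean, sketched outside DataStage in Python. The file names and the root tag name are just placeholders, not from my actual job:

```python
# Hypothetical sketch of the merge described above: concatenate the
# single-line XML files, then strip the repeated declarations and root
# tags so the result is one well-formed document. The root tag
# "profitcenter" is an assumed placeholder.
import re

def merge_xml(paths, root="profitcenter"):
    bodies = []
    for path in paths:
        with open(path) as f:
            text = f.read()
        # drop any <?xml ...?> declaration
        text = re.sub(r"<\?xml[^>]*\?>", "", text)
        # drop the wrapping root element, keeping its contents
        m = re.search(r"<%s[^>]*>(.*)</%s>" % (root, root), text, re.S)
        bodies.append(m.group(1) if m else text)
    return "<%s>%s</%s>" % (root, "".join(bodies), root)
```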

Thanks,
Sree

Posted: Fri Apr 06, 2007 7:03 am
by pratapsriram
I have played with this a little since the last post. Here is where I have gotten to: I read with a Sequential File stage using the Specific File(s) option, and in the Filter option I use cat to merge the input files into a single file. It works for small files, but with a larger file it aborts with this error:
Error reading on import.
Consumed more than 100,000 bytes looking for record delimiter; aborting
Import error at record 0.
The runLocally() of the operator failed.

How can I work around this issue?
Please help.
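For comparison, outside DataStage a file with no record delimiter is trivial to consume as one record. A minimal sketch (the chunked read is only so very large files don't need a line-oriented parser):

```python
# Read an entire delimiter-less file as a single "record" string,
# in fixed-size chunks rather than line by line.
def read_single_record(path, chunk_size=64 * 1024):
    parts = []
    with open(path) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            parts.append(chunk)
    return "".join(parts)
```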

Posted: Fri Apr 06, 2007 7:14 am
by chulett
Seems like a difficult approach. Why not parse your multiple files and load the flattened data into a work table? Then extract the total data set from there and create a single proper file from it? A database table would allow you to help the XML generation by sorting or grouping, if needed. Seems to me you could also just land and concatenate the data into a flat file.

This rather than just stitch together these files and then hack out the naughty bits. :wink:
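The work-table idea can be sketched roughly like this in Python; the table layout, the "row" elements, and the "profitcenter" root are all invented for illustration, not taken from any actual job design:

```python
# Rough sketch of the suggested approach: parse each input document,
# load the flattened rows into a work table, then extract the total
# data set (sorted) and emit one proper XML file. Names are assumed.
import sqlite3
import xml.etree.ElementTree as ET

def load_and_extract(xml_strings):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE work (id INTEGER, value TEXT)")
    for doc in xml_strings:
        root = ET.fromstring(doc)
        for row in root.findall("row"):
            conn.execute("INSERT INTO work VALUES (?, ?)",
                         (int(row.get("id")), row.text))
    # sorting here is the "help the XML generation" step
    rows = conn.execute("SELECT id, value FROM work ORDER BY id").fetchall()
    body = "".join('<row id="%d">%s</row>' % (i, v) for i, v in rows)
    return "<profitcenter>%s</profitcenter>" % body
```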