XML files parsing error

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

pratapsriram
Premium Member
Posts: 41
Joined: Tue Jan 24, 2006 3:43 pm
Location: United States

Post by pratapsriram »

Hi,
I am trying to merge 5 XML files into one XML file. The input XML files do not have an LF (Unix newline) character at the end, so each file is basically a single string of characters with no LF. I expect the merged output file to be a concatenation of the individual strings from each file. Later on I will delete the repeated tags so that it looks like a single XML document. I have tried doing this, and DataStage reports the following parsing error:
Inputfiles_profitcenter_XML,0: Error reading on import.
Inputfiles_profitcenter_XML,0: Consumed more than 100,000 bytes looking for record delimiter; aborting

I don't want to have a record delimiter, so can DataStage handle files without a delimiter?

I have searched the forum, but nowhere is it mentioned how to parse XML files.
If anyone has done this before, please help me figure this out.

Thanks,
Sree
pratapsriram
Premium Member
Posts: 41
Joined: Tue Jan 24, 2006 3:43 pm
Location: United States

Post by pratapsriram »

I have played around a little since the last post. Here is what I have so far.
I use a Sequential File stage to read, with the specific files option, and a filter that uses cat to merge the input files into a single file. It works for small files, but with a bigger file it aborts with the error:
Error reading on import.
Consumed more than 100,000 bytes looking for record delimiter; aborting
Import error at record 0.
The runLocally() of the operator failed.

How can I work around this issue?
Please help.
Knowledge is Power
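
For reference, the plain concatenation described above can also be done outside DataStage. This is only a minimal sketch, not something confirmed in this thread: the function name `concatenate_files` and the idea of running it as a pre-processing step are assumptions, and it does not change how the Sequential File stage looks for a record delimiter.

```python
from pathlib import Path


def concatenate_files(input_paths, output_path):
    """Write the raw bytes of each input file, in order, to output_path.

    Hypothetical helper: no newline or other delimiter is added between
    files, so the result is the plain concatenation of the inputs.
    """
    with open(output_path, "wb") as out:
        for path in input_paths:
            out.write(Path(path).read_bytes())
```

The result is one long string with repeated root tags, so it is not a well-formed XML document until those tags are cleaned up.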
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Seems like a difficult approach. Why not parse your multiple files and load the flattened data into a work table, then extract the total data set from there and create a single proper file from it? A database table would let you help the XML generation along by sorting or grouping, if needed. Seems to me you could also just land and concatenate the data into a flat file.

That rather than just stitching these files together and then hacking out the naughty bits. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
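
To illustrate the "parse first, then write one proper file" idea outside DataStage, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is not the approach described in the thread (no work table or DataStage stages are involved). It assumes every input file has the same root element, and the function name `merge_xml_files` is hypothetical.

```python
import xml.etree.ElementTree as ET


def merge_xml_files(input_paths, output_path):
    """Parse each input file and collect its child elements under one root.

    Assumption: all inputs share the same root tag. The merged root reuses
    the tag and attributes of the first file's root element.
    """
    merged_root = None
    for path in input_paths:
        root = ET.parse(path).getroot()
        if merged_root is None:
            # Start the merged document from the first file's root element.
            merged_root = ET.Element(root.tag, root.attrib)
        elif root.tag != merged_root.tag:
            raise ValueError(f"unexpected root <{root.tag}> in {path}")
        # Move the children of this file's root under the merged root.
        merged_root.extend(list(root))
    if merged_root is None:
        raise ValueError("no input files given")
    ET.ElementTree(merged_root).write(
        output_path, encoding="utf-8", xml_declaration=True
    )
```

Because each file is parsed on its own, a missing trailing newline does not matter here, and there are no repeated root tags to strip out afterwards.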