XML tags read issue

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

prasad v
Participant
Posts: 174
Joined: Mon Mar 30, 2009 2:18 am

Post by prasad v »

Hi

I am extracting data from files that contain XML tags and newline characters within a field. The file format is as follows:
Field delimiter - comma
Record delimiter - newline
Fields surrounded by double quotes

Can someone please let me know how to read these fields? One row becomes several rows because of the newline characters.
A row in the source file looks like this:

Code: Select all

"<?xml version='1.0'?>
 <fcpMailFormat=\"normal\" fcpMode=\"substituteAd>
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Can you not parse it using an XML stage?
-craig

"You can never have too many knives" -- Logan Nine Fingers
prasad v
Participant
Posts: 174
Joined: Mon Mar 30, 2009 2:18 am

Post by prasad v »

Thanks, chulett. But sorry, it's not XML; it is HTML tags.

I cannot use the XML stage because this is just one of the fields in a .csv file; the other fields are normal text.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

prasad v wrote:But Sorry, It's not XML. It is html tags.
Your example says otherwise. However, do you actually want to parse information out of the field or are you simply looking for how to read it properly? For the latter, use the Server Sequential File stage as it can handle fields that "contain terminators".
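
As a rough illustration of what "contain terminators" means, here is a minimal Python sketch. It is not DataStage code. The sample data is hypothetical and assumes embedded double quotes are doubled; if the file escapes them with a backslash, as the sample row in the first post suggests, `csv.reader` accepts an `escapechar` argument for that case. A purely line-based reader sees three rows, while a quote-aware reader sees two records.

```python
import csv
import io

# Hypothetical sample: the third field is quoted and spans two physical lines.
SAMPLE = '1,1,"<?xml version=\'1.0\'?>\n<doc a=""x"">",2\n1,2,"<doc/>",3\n'


def naive_rows(text):
    """Split on every newline, the way a purely line-based reader would."""
    return [line.split(",") for line in text.splitlines()]


def quote_aware_rows(text):
    """Parse with csv.reader, which keeps newlines that sit inside quoted fields."""
    # newline="" stops StringIO from translating line endings before csv sees them.
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    print(len(naive_rows(SAMPLE)))        # count of physical lines
    print(len(quote_aware_rows(SAMPLE)))  # count of logical records
```

The point of the comparison is only that the reader has to track quote state across line breaks; which stage option provides that behaviour is a DataStage question the sketch does not answer.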
-craig

"You can never have too many knives" -- Logan Nine Fingers
prasad v
Participant
Posts: 174
Joined: Mon Mar 30, 2009 2:18 am

Post by prasad v »

I just want to read the data and load it into a SQL Server field. Do you mean the Sequential File stage in Server jobs?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yes. If this needs to be a PX job then use the Server Sequential File stage in a Server Shared Container.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Your requirement leaves a lot unspecified...

Do you want all of the content in the file in a single SQL Server field?

If that is the case, then you can simply change the record delimiter to a character that doesn't appear within your data and read the entire file as a single record with one field.

Mike
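
The idea of picking a record delimiter that never occurs in the data can be sketched in Python as follows. This is only an illustration, not DataStage configuration: the candidate characters and both function names are assumptions made for the example.

```python
# Candidate separators are an assumption; any character absent from the data works.
CANDIDATES = ("\x1e", "\x1f", "\x00")


def pick_unused_delimiter(data, candidates=CANDIDATES):
    """Return the first candidate character that never occurs in the data."""
    for ch in candidates:
        if ch not in data:
            return ch
    raise ValueError("every candidate delimiter appears in the data")


def read_as_single_record(data, delimiter):
    """Split on the delimiter; an unused delimiter leaves the content as one record."""
    return data.split(delimiter)


if __name__ == "__main__":
    content = '1,1,"<xml>\n<a/>",2\n1,2,"<xml/>",3\n'
    delim = pick_unused_delimiter(content)
    records = read_as_single_record(content, delim)
    print(len(records))  # number of records seen with the chosen delimiter
```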
prasad v
Participant
Posts: 174
Joined: Mon Mar 30, 2009 2:18 am

Post by prasad v »

Thanks everyone.

I have a source file:
Col1,Col2,Col3,Col4
1,1,<xml=><>,2
1,2,<xml><>,3

But the Col3 values contain newline characters, and the record delimiter is also newline.

What I am doing:
Sequential File-->Transformer-->ODBC
Here I am using a schema file in the Sequential File stage and passing all the columns to the SQL Server table. This is a multi-instance job used to process several files.

The 4 columns in the file go into 4 columns in the table.

I have a similar process that does the same thing:

Sequential File-->Column Import-->Transformer-->ODBC

Here the Sequential File stage reads the whole content as a single column, and Column Import uses the schema file to split it into columns. My question here: what is the maximum length I can read in the Sequential File stage?
Neither process includes any transformations; both are straight loads, apart from adding an audit ID (execution ID).

Thanks
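
For readers unfamiliar with the second design, the "read as one column, then split by schema" step can be sketched in Python. This is an illustration only, not the Column Import stage itself: the column names come from the sample file above, while `import_columns` and the assumption that the multi-line field is double-quoted are hypothetical.

```python
import csv
import io

# Column names taken from the sample file in the post.
SCHEMA = ["Col1", "Col2", "Col3", "Col4"]


def import_columns(raw_record, schema=SCHEMA):
    """Split one raw record string into named columns, honouring quoted newlines."""
    rows = list(csv.reader(io.StringIO(raw_record, newline="")))
    if len(rows) != 1:
        raise ValueError(f"expected exactly one record, got {len(rows)}")
    fields = rows[0]
    if len(fields) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(fields)}")
    return dict(zip(schema, fields))


if __name__ == "__main__":
    # Hypothetical raw record whose third field contains a newline.
    print(import_columns('1,1,"<xml>\n<a/>",2'))
```

The sketch only works if the raw record reaches the splitting step intact, which is the same embedded-newline problem discussed earlier in the thread.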