Generating multiple XML files from XML output stage

suryadev
Premium Member
Posts: 211
Joined: Sun Jul 11, 2010 7:39 pm

Post by suryadev »

I am getting the input as XML and parsing it, which gives the records below:

Name  Key  Acct
ABC   123  111
DEF   456  111
GHI   789  333
JKL   712  222
MNO   765  222

I am sending all of these records into the XML Output stage to produce three XML files based on the acct number: the records with acct 111 go into one XML file, the record with acct 333 into a second, and the records with acct 222 into a third. So three files should be generated. The key for the XML is the same key column.
I set acct as the trigger column, but no XML was generated. I also tried the single column option, which produced five separate files rather than the three required.

Thank you
Thanks,
Surya
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Hard to say, but try sorting by that column upstream.
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
suryadev
Premium Member
Posts: 211
Joined: Sun Jul 11, 2010 7:39 pm

Post by suryadev »

Thank you. So you mean I should sort on the acct column and then send the rows into the XML Output stage? Do any options need to be changed in the XML Output stage to generate the XML files?
Thanks,
Surya
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No, a change in the trigger column is what determines when a new file is created. The suggestion was to sort by that column so the values arrive together and you get three files rather than five. Also understand that partitioning could affect the output.
-craig

"You can never have too many knives" -- Logan Nine Fingers
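
To illustrate the point about the trigger column, here is a minimal sketch in plain Python (not DataStage code). It assumes, as described above, that a new file is started whenever the trigger value differs from the previous row. The `split_on_trigger_change` helper and the interleaved arrival order are hypothetical, chosen only to show why unsorted or repartitioned rows can yield more files than there are distinct acct values.

```python
from itertools import groupby

# Rows as (name, key, acct), taken from the original post.
rows = [
    ("ABC", 123, 111),
    ("DEF", 456, 111),
    ("GHI", 789, 333),
    ("JKL", 712, 222),
    ("MNO", 765, 222),
]

def split_on_trigger_change(rows):
    """Start a new batch each time acct differs from the previous row.

    itertools.groupby groups only consecutive equal keys, which mirrors
    the "change in the trigger column" behaviour assumed here.
    """
    return [list(batch) for _, batch in groupby(rows, key=lambda r: r[2])]

# Hypothetical arrival order in which the two 111 rows are not adjacent.
interleaved = [rows[0], rows[2], rows[1], rows[3], rows[4]]

unsorted_batches = split_on_trigger_change(interleaved)
sorted_batches = split_on_trigger_change(sorted(interleaved, key=lambda r: r[2]))

print(len(unsorted_batches))  # one batch per run of equal acct values
print(len(sorted_batches))    # one batch per distinct acct value
```

With the interleaved order the 111 rows form two separate runs, so an extra batch appears; after sorting on acct, each distinct value forms exactly one batch.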
suryadev
Premium Member
Posts: 211
Joined: Sun Jul 11, 2010 7:39 pm

Post by suryadev »

I used a Sort stage ahead of a Transformer and then the XML Output stage. I sorted on ACC_NO and ran the Sort stage, Transformer, and XML Output stage sequentially rather than in parallel. With the same trigger_column option, three files were generated. Now I need to load these XML CLOBs into a table as three records, so I am keeping the flow sequential all the way to the table.

Please let me know if anything here needs to be changed.

@chulett, as you mentioned, how do I get rid of the partitioning effect? Since I am using a sequential flow, partitioning should not affect the output. Please advise.

Thank you!
Thanks,
Surya
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Add an output link to the xmlOutput stage and put one column on it, called something like "xmlContent". Make it a long varchar with some arbitrarily long length. Put a single / in the description.

Send that column downstream to an rdbms stage or wherever.

Ernie
Ernie Ostic
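
The "one XML column on a link, sent to an rdbms stage" idea can be pictured outside DataStage with a small Python sketch. This is only an analogy under stated assumptions: the `xml_out` table, the in-memory SQLite database, and the tiny XML strings are all hypothetical stand-ins, and only the `xmlContent` column name comes from the suggestion above. Each document that arrives on the link becomes one row.

```python
import sqlite3

# Hypothetical stand-ins for the three XML documents arriving on the link.
xml_docs = [
    "<accounts acct='111'><rec key='123'/><rec key='456'/></accounts>",
    "<accounts acct='222'><rec key='712'/><rec key='765'/></accounts>",
    "<accounts acct='333'><rec key='789'/></accounts>",
]

conn = sqlite3.connect(":memory:")
# One long text column, mirroring the single "xmlContent" link column.
conn.execute("CREATE TABLE xml_out (xmlContent TEXT)")
conn.executemany(
    "INSERT INTO xml_out (xmlContent) VALUES (?)",
    [(doc,) for doc in xml_docs],
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM xml_out").fetchone()[0]
print(row_count)  # number of rows loaded, one per XML document
conn.close()
```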
suryadev
Premium Member
Posts: 211
Joined: Sun Jul 11, 2010 7:39 pm

Post by suryadev »

Thank you Ernie!

In the XML_Output stage, under Stage --> Options --> Write to file, I gave a path, and the number of files generated is based on the trigger column.

But if I instead add an output link from the XML Output stage to a Sequential File stage, only one file is created, with all of the XML tags in it.

Is this because I am loading into a file, and will it behave differently when loaded into a table?

I also have another requirement where I need to write files from XML_Output, as I need to add a key to each file.
Thanks,
Surya
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Exactly. When you have xmlOutput do the writing, it writes three 'files'; when you send it on a link, you get three "rows" instead of just one row.

What you do with the xml downstream with rdbms or sequential stages is up to you.

Your original question was how to load three rows.

Ernie
Ernie Ostic
suryadev
Premium Member
Posts: 211
Joined: Sun Jul 11, 2010 7:39 pm

Post by suryadev »

Yes Ernie, that works when the XML Output stage does the writing, as you said: if I give filename.xml as the file path in the XML Output stage, it generates filename.xml, filename1.xml, filename2.xml, and so on. But if I use a Sequential File stage with the same path, all three rows are combined into filename.xml.

What changes do I need to make in the Sequential File stage so that each row is written to its own file?

Thanks again
Thanks,
Surya
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

If you have 9.1.2, the Seq Stage has options for that. If you are prior to 9.1.2, look into the Folder Stage of a Server Job; it can do things like that.

Ernie
Ernie Ostic
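
For readers who end up doing the "one row, one file" split outside the job, here is a minimal Python sketch of that behaviour. It is not the Sequential File or Folder stage configuration; the `write_one_file_per_row` helper, the temporary directory, and the sample XML strings are hypothetical. The naming scheme (filename.xml, filename1.xml, filename2.xml) follows the pattern described earlier in the thread.

```python
import tempfile
from pathlib import Path

def write_one_file_per_row(xml_rows, out_dir, stem="filename"):
    """Write each XML string to its own file.

    The first row goes to <stem>.xml, later rows to <stem>1.xml,
    <stem>2.xml, and so on.
    """
    out_dir = Path(out_dir)
    paths = []
    for i, xml in enumerate(xml_rows):
        suffix = "" if i == 0 else str(i)
        path = out_dir / f"{stem}{suffix}.xml"
        path.write_text(xml, encoding="utf-8")
        paths.append(path)
    return paths

# Hypothetical rows, one XML document per row.
xml_rows = [
    "<accounts acct='111'/>",
    "<accounts acct='222'/>",
    "<accounts acct='333'/>",
]

with tempfile.TemporaryDirectory() as tmp:
    written = write_one_file_per_row(xml_rows, tmp)
    names = [p.name for p in written]
    contents = [p.read_text(encoding="utf-8") for p in written]

print(names)
```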