Split single XML to Multiple XMLs based on count

suja.somu
Participant
Posts: 79
Joined: Thu Feb 07, 2013 10:51 pm

Post by suja.somu »

There is a job that creates a very large XML file. The requirement is to split that single large XML into multiple XMLs based on a record count.

For example, if the large XML has 900 records, I need to split it into 3 smaller XMLs of 300 records each. The count of 300 does not vary. Is that doable in the new XML stage in version 8.5?

I need to know whether it's possible to create the 3 XMLs using the new XML stage, or whether I have to create a post-processing job to do this.

What are the different ways to handle this scenario?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There is a "trigger column" property that does exactly that.
-craig

"You can never have too many knives" -- Logan Nine Fingers
suja.somu
Participant
Posts: 79
Joined: Thu Feb 07, 2013 10:51 pm

Post by suja.somu »

Could you please help me locate the property in the Assembly XML editor?

Can we pass the count to the XML stage as a parameter?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'm assuming it's part of the new XML stage but couldn't tell you where or exactly how it was implemented... but I'm sure it would be documented somewhere. In the Old World, you designate a Trigger Column and whenever the value in it changes the current file is closed and a new one opened. So we would basically treat it as a numeric value and just Mod() the @OUTROWNUM or outgoing record count. When the remainder was zero, we would increment the Trigger Column value and that change would cause a new file to spool off. And there's no need to include the trigger in the output XML. The downside was you had very little control over the name it used for each file.

So that would be one approach, Mod() using your desired count. For all I know there's something more better built into the new Assembly.
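To make the Mod() idea concrete, here is a minimal sketch in Python rather than a Transformer derivation. It only simulates the logic: the function names `trigger_values` and `split_by_trigger` are made up for illustration, and the 1-based row number stands in for @OUTROWNUM. The trigger is bumped after every row whose number divides evenly by the batch size, so the next row starts a new file.

```python
from itertools import groupby


def trigger_values(row_count, batch_size):
    """Yield (row_number, trigger) pairs; row numbers are 1-based, like @OUTROWNUM."""
    trigger = 0
    for row_number in range(1, row_count + 1):
        yield row_number, trigger
        # When the remainder is zero the batch is full, so bump the trigger
        # and the NEXT row carries a changed value.
        if row_number % batch_size == 0:
            trigger += 1


def split_by_trigger(pairs):
    """Start a new 'file' (list of row numbers) whenever the trigger value changes."""
    return [[row for row, _ in group] for _, group in groupby(pairs, key=lambda p: p[1])]


files = split_by_trigger(trigger_values(900, 300))
print([len(f) for f in files])  # [300, 300, 300]
```

With 900 rows and a batch size of 300, the trigger takes the values 0, 1 and 2, which gives three groups of 300 rows each.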
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

...there is, but it isn't as straightforward. Ultimately, you map your "changing" element to the Document Collection, and that will result in a new document being generated and spun out of the Stage....

However --- for something like this, I wouldn't try it that way. Sounds like you have a perfectly working Job, with a perfectly valid Assembly. Rather than make code changes inside of the Assembly, just send an end-of-wave upstream from the xml stage, based on the row count.

Put an end-of-wave stage upstream, and have it react (send a new wave) based on a set of counters where you have a flag that indicates when you've hit a new group ("n" rows or whatever). The end-of-wave indicator will signal the xml stage to wrap up and finish its aggregation of nodes, then start again with the next batch of incoming rows. If you set up your counters and indicators for "triggering" every 300 rows, and you pass in 600 rows, you would get 2 rows out, each holding one complete XML document. Send those as one big column to their ultimate destination (sequential stage, mq, database, etc.).
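As a rough illustration of the wave idea outside of DataStage, the sketch below batches incoming rows into "waves" of a fixed size and builds one XML document per wave. It is only a simulation: `waves`, `to_xml_document` and the `records`/`record` element names are hypothetical, and flushing a final partial wave at end of data is an assumption about how you would want leftover rows handled.

```python
import xml.etree.ElementTree as ET


def waves(rows, wave_size):
    """Group rows into batches, closing a batch each time wave_size rows have arrived."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == wave_size:
            yield batch  # the "end of wave" signal
            batch = []
    if batch:
        yield batch  # assumption: leftover rows form a final, smaller wave


def to_xml_document(batch):
    """Aggregate one wave of rows into a single XML document string."""
    root = ET.Element("records")  # hypothetical element names
    for value in batch:
        ET.SubElement(root, "record").text = str(value)
    return ET.tostring(root, encoding="unicode")


documents = [to_xml_document(b) for b in waves(range(1, 601), 300)]
print(len(documents))  # 2
```

Passing in 600 rows with a wave size of 300 produces two strings, which matches the "2 rows out" described above; each string could then be written to its own file or sent on as a single column.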

Ernie
Ernie Ostic

chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Interesting!
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

And clever!
Choose a job you love, and you will never have to work a day in your life. - Confucius
oracledba
Premium Member
Posts: 49
Joined: Mon Aug 06, 2012 9:21 am

Post by oracledba »

smart :)