Datastage XML input stage question

Posted: Wed Oct 18, 2017 3:44 pm
by vik1979
Hello All,

I am using the XML Input stage to parse XML files in a parallel job. I would like to know how to let records pass through the XML parsing when a repeating element is not found in the message.

<Store>
     <ID>
          <code>
               <Item extension="11"/>
          </code>
     </ID>
     <ID>
          <code>
               <Item extension="21"/>
          </code>
     </ID>
     <Container>
          <Stock code="abc"/>
     </Container>
     <Container>
          <Stock code="bcd"/>
     </Container>
</Store>
In the above example, the Store and ID elements are mandatory in the message and the Container element is optional. I am parsing the XML with the ID element as the key, then passing the chunk on and parsing the Container info in the next XML stage.
I am using the design below. In XML stage 2, when the Container info is missing from the message (it is optional), no records come out of XML stage 2. I have 'repetitive element required' unchecked in the Transformation Settings and am using Container ID as the key.

[External Source] -> [XML Stage 1 Parsing ID info] -> [XML Stage 2 Parsing Container info] -> Peek

The other option is to parse both elements separately and use a Join stage to combine them. But we are trying to minimize the number of joins in the job, so I would like to know if there is a way to make my first option work.

Thanks,
Vik.

Re: Datastage XML input stage question

Posted: Wed Oct 18, 2017 3:55 pm
by vik1979
The first option is working fine. I changed the path I gave for the 2nd XML stage.

Posted: Thu Oct 19, 2017 10:52 am
by asorrell
I know you are on release 9, but I wanted to mention that the Hierarchical Stage in 11.3 does a better job of handling these situations. It has some different limitations but is usually more flexible than the XML stage. As long as your XSD (schema) defines it as a repeating element, the Hierarchical Stage should handle it whether it occurs zero or more times.

If your site has 11.3 available (now or coming soon) then you might want to look at the Hierarchical stage.

Posted: Thu Oct 19, 2017 11:25 am
by eostic
As Andy notes, the Hierarchical stage can do some nice things... but beware! It cannot handle parents with no children when both are on the same link.
xmlInput handles this fine when repeating element required is unchecked, and nulls will result for the children. The Hierarchical stage will lose the parent entirely, forcing separate links.
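
For anyone who wants to see the difference outside DataStage, here is a minimal Python sketch (standard library only, not DataStage code) of the "keep the parent, null the children" result. The function name `rows_keep_parent`, the comma-joined ID key, and the two sample messages are assumptions made up for illustration; they are not how either stage is actually implemented.

```python
import xml.etree.ElementTree as ET

# Sample messages modeled on the one in the first post (hypothetical data).
WITH_CONTAINERS = """
<Store>
  <ID><code><Item extension="11"/></code></ID>
  <ID><code><Item extension="21"/></code></ID>
  <Container><Stock code="abc"/></Container>
  <Container><Stock code="bcd"/></Container>
</Store>
"""

WITHOUT_CONTAINERS = """
<Store>
  <ID><code><Item extension="11"/></code></ID>
  <ID><code><Item extension="21"/></code></ID>
</Store>
"""


def rows_keep_parent(xml_text):
    """Return one (id_key, stock_code) row per Container.

    When the optional Container element is absent, still return a single
    row for the parent with stock_code set to None.
    """
    store = ET.fromstring(xml_text)
    # Assumption: build the parent key from all ID extensions, comma-joined.
    key = ",".join(
        item.get("extension", "") for item in store.iterfind("ID/code/Item")
    )
    codes = [stock.get("code") for stock in store.iterfind("Container/Stock")]
    if not codes:
        # No repeating element found: keep the parent, null the child column.
        return [(key, None)]
    return [(key, code) for code in codes]
```

If the `if not codes` branch is dropped, a Store without any Container produces no rows at all, which is the "parent is lost" case.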

The Hierarchical Stage is more performant and can handle huge XML docs, but when the XML docs are small, use the xmlInput Stage instead.

Ernie

Posted: Fri Oct 20, 2017 9:53 am
by vik1979
Thanks for the reply. I am unable to see your complete post; I need to get premium access :)
I will definitely look into the IBM notes for the Hierarchical stage in 11.3.
I am working on complex HL7 standard schemas, which have a lot of parent-child relationships.
I tried the XML stage in 9.1 before switching to the XML Input stage. It made parsing the elements under each list into a separate output complicated, and combining the data would have needed too many joins.
So I am trying to understand how the Hierarchical stage differs from the XML stage. Thanks.

XML PARSER ERROR

Posted: Wed Nov 22, 2017 5:10 pm
by venkata9
Hi,

I'm not able to post a new topic even though I'm a premium member,
so I'm posting my question here.

I'm getting the XML parser error below at random:

Environment: DS 9.1.2, Red Hat Linux 6

XMLS_MESSAGE_PARSER,0: Message bundle error Can't find resource for bundle com.ibm.e2.Bundle_E2_engine_msgs_en_US, key E2IllegalStateException.parentCursorInvalid.

Could someone help?