XML Hierarchical Stage Produces No Rows

irvinew
Participant
Posts: 15
Joined: Mon Jun 18, 2018 8:52 am
Location: Regina, SK

XML Hierarchical Stage Produces No Rows

Post by irvinew »

I have successfully read in 4 XSD files.

I have 1 xml source test file to read in.

The source XML file really has 2 parts. The first, called AcademicRecordBatch, has basic Sender/Destination info that does not change; I can read that in, and it produces rows in a separate job.

The second part has high school transcript data, called HighSchoolTranscript. I have problems with this part. This is in a separate job.

I am using 2 jobs because I couldn't get the union parser stage to work without hanging the system.

My job is simple to start with: it goes from a Hierarchical stage straight to a Peek stage just to get things going. The problem is that the high school job compiles but produces no rows. To start, it is only mapping firstname and lastname.

Sometimes I run into a scalar error like this:

com.ibm.e2.core.exceptions.E2IllegalStateException: CDIER0835E: In step XML_Parser, the Hierarchical Data stage tried to assign the value true to the {http://www.ibm.com/e2/reserved}@@isPresent scalar element, but the element already has the value true. Possible mapping error involving ListToGroup. The parent list element for the scalar element is AcademicSummary.
Test completed

I'm not sure whether warnings prevent the rows from being read, but I am getting successful compiles and no rows, whether the mapping is simple or complex. I know this is rather vague, but any thoughts on where to start debugging? I have tried chunking, with successful compilations, but still no rows.

Thanks in advance for the help.




This is the source xml file

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<AcRecBat:AcademicRecordBatch xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:org:pesc:message:AcademicRecordBatch:v2.1.0 AcademicRecordBatch_v2.1.xsd">
	<BatchEnvelope>
		<BatchID>00000001</BatchID>
		<BatchDateTime>2018-01-31T11:06:35-07:00</BatchDateTime>
		<BatchDeliveryMethod>DeliverWhole</BatchDeliveryMethod>
		<SourceAgency>
			<Organization>
				<APAS>SK00000000</APAS>
				<LocalOrganizationID>
					<LocalOrganizationIDCode>SK00000000</LocalOrganizationIDCode>
					<LocalOrganizationIDQualifier>SK</LocalOrganizationIDQualifier>
				</LocalOrganizationID>
				<OrganizationName>Saskatchewan Ministry of Education</OrganizationName>
				<Contacts>
					<Phone>
						<AreaCityCode>306</AreaCityCode>
						<PhoneNumber>7876012</PhoneNumber>
					</Phone>
					<Email>
						<EmailAddress>student.records@gov.sk.ca</EmailAddress>
					</Email>
				</Contacts>
			</Organization>
		</SourceAgency>
		<DestinationAgency>
			<Organization>
				<PSIS>47004000</PSIS>
				<LocalOrganizationID>
					<LocalOrganizationIDCode>47004000</LocalOrganizationIDCode>
					<LocalOrganizationIDQualifier>SK</LocalOrganizationIDQualifier>
				</LocalOrganizationID>
				<OrganizationName>University of Regina</OrganizationName>
			</Organization>
		</DestinationAgency>
	</BatchEnvelope>
	<BatchContent>
		<HSTrn:HighSchoolTranscript xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:n1="http://www.altova.com/samplexml/other-namespace" xmlns:HSTrn="urn:org:pesc:message:HighSchoolTranscript:v1.5.0" xsi:schemaLocation="urn:org:pesc:message:HighSchoolTranscript:v1.5.0 HighSchoolTranscript_v1.5.0.xsd">
			<TransmissionData>
				<DocumentID>2018-01-3111061</DocumentID>
				<CreatedDateTime>2018-01-31T11:06:35-07:00</CreatedDateTime>
				<DocumentTypeCode>StudentRequest</DocumentTypeCode>
				<TransmissionType>Original</TransmissionType>
				<Source>
					<Organization>
						<APAS>SK00000000</APAS>
						<LocalOrganizationID>
							<LocalOrganizationIDCode>SK00000000</LocalOrganizationIDCode>
							<LocalOrganizationIDQualifier>SK</LocalOrganizationIDQualifier>
						</LocalOrganizationID>
						<OrganizationName>Saskatchewan Ministry of Education</OrganizationName>
						<Contacts>
							<Phone>
								<AreaCityCode>306</AreaCityCode>
								<PhoneNumber>7876012</PhoneNumber>
							</Phone>
							<Email
Will
wpkalsow
Premium Member
Posts: 11
Joined: Wed Mar 12, 2003 6:13 pm

Post by wpkalsow »

The XML_Parser error will only occur during parsing of data to match the xsd during testing or execution of the stage.

Sounds like it is a mapping issue to me.

Can you share the xsd for this data?
irvinew
Participant
Posts: 15
Joined: Mon Jun 18, 2018 8:52 am
Location: Regina, SK

Post by irvinew »

Hope this helps; it's the HighSchool XSD. The imported CoreMain is far too large to paste here, and AcademicRecord is quite big as well. Those 2 files don't appear in the Library as namespaces; only AcademicRecordBatch and HighSchoolTranscript can be found to be used as XSD documents in the parser stages.

Thank you in advance :)

Code: Select all

	<xs:import namespace="urn:org:pesc:core:CoreMain:v1.14.0" schemaLocation="CoreMain_v1.14.0.xsd"/>
	<xs:import namespace="urn:org:pesc:sector:AcademicRecord:v1.9.0" schemaLocation="AcademicRecord_v1.9.0.xsd"/>
	<!--============================================================================-->
	<!--Name:      HighSchoolTranscript.xsd  -->
	<!--Version:  1.5.0-->
	<!--Date:       17-December-2014-->
	<!---->
	<!--Change Log:-->
	<!--v1.0.x 23-May-2005 Bruce Marton  - Draft version proposed by PESC High School Transcript workgroup. -->
	<!--v1.0.x 24-May-2005 Bruce Marton  -Minor corrections. -->
	<!--v1.0.x 15-September-2005 Bruce Marton  - Additional draft changes proposed by PESC High School Transcript workgroup. -->
	<!--v1.0.0 15-February-2006 Bruce Marton  - Final proposed changes for PESC High School Transcript as approved for public comment by PESC Change Control Board (reviewed - JAF). -->
	<!--v1.2.0 29-April-2011 Jeffrey Funck  -  -->
	<!--Include all changes requested from Tom Stewart -->
	<!--   Change #   TS20110329030400 -->
	<!--v1.3.0 15-June-2012 Jeffrey Funck  -  -->
	<!--Modify to pull in new versions of sector libraries -->
	<!--   Change #   TS20120305094902 -->
	<!--v1.4.0 15-October-2013 Jeffrey Funck  -  -->
	<!--Modified to use the newest version of CoreMain (v1.13.0)-->
	<!--   Change #   TS20130624000001 -->
	<!--v1.5.0 17-December-2014 Jeffrey Funck  -  -->
	<!--Modified to use the newest version of CoreMain (v1.14.0)-->
	<!--   Change #   MB20140606000001 -->
	<!--============================================================================-->
	<!---->
	<xs:element name="HighSchoolTranscript">
		<xs:complexType>
			<xs:sequence>
				<xs:element name="TransmissionData" type="AcRec:TransmissionDataType"/>
				<xs:element name="Student" type="AcRec:K12StudentType"/>
				<xs:element name="NoteMessage" type="core:NoteMessageType" minOccurs="0" maxOccurs="unbounded"/>
				<xs:element name="UserDefinedExtensions" type="core:UserDefinedExtensionsType" minOccurs="0"/>
			</xs:sequence>
		</xs:complexType>
	</xs:element>
</xs:schema>
Will
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:idea:

FYI - code tags preserve whitespace / formatting, otherwise the forum software gets rid of all those pesky 'extra' spaces.
-craig

"You can never have too many knives" -- Logan Nine Fingers
irvinew
Participant
Posts: 15
Joined: Mon Jun 18, 2018 8:52 am
Location: Regina, SK

Post by irvinew »

Further context:

I think I figured out why it doesn't read rows; though I've yet to get it to work.

I have multiple XSDs validating and parsing a single XML file. That XML file has other XML documents bundled into it, which are supposed to be contained in "BatchContent". The document root has an element whose type refers to this bundled "BatchContent" section. Here is the code for the main XSD document:


<xs:schema xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.1.0" xmlns:AcRec="urn:org:pesc:sector:AcademicRecord:v1.9.0" xmlns:core="urn:org:pesc:core:CoreMain:v1.14.0" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="urn:org:pesc:message:AcademicRecordBatch:v2.1.0" elementFormDefault="unqualified" attributeFormDefault="unqualified" version="v2.1.0">
<xs:import namespace="urn:org:pesc:core:CoreMain:v1.14.0" schemaLocation="CoreMain_v1.14.0.xsd"/>
<xs:import namespace="urn:org:pesc:sector:AcademicRecord:v1.9.0" schemaLocation="AcademicRecord_v1.9.0.xsd"/>
<!--============================================================================-->
<!--Name: AcademicRecordBatch-->
<!--Version: 2.1.0-->
<!--Date: 17-December-2014-->
<!---->
<!--Change Log:-->
<!-- Change # JTS20070816102300 -->
<!-- Reviewed by Jeffrey A Funck -->
<!--2.0.0 14-March-2008 Tuan Anh Do - Restructured Schema to include Transmission Data Segment for sending/receiving agencies -->
<!-- Changes for this version is not backwards compatible with v1.0.0. -->
<!-- The Batch Content is mandatory and places the data package as the child which moves it down one layer.-->
<!-- The Batch Envelope is entirely Optional -->
<!--v2.1.0 17-December-2014 Jeffrey Funck - -->
<!--Modified to use the newest version of CoreMain (v1.14.0)-->
<!-- Change # MB20140606000001 -->
<!--============================================================================-->
<!---->
<xs:element name="AcademicRecordBatch">
<xs:complexType>
<xs:sequence>
<xs:element name="BatchEnvelope" type="AcRec:TransmissionBatchType" minOccurs="0"/>
<xs:element name="BatchContent" type="core:AcademicRecordBatchType"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>



The type of "BatchContent" is defined in the "CoreMain" XSD document; this is the referenced type:



<xs:complexType name="AcademicRecordBatchType">
	<xs:annotation>
		<xs:documentation>This is used to create a place holder and root element to contain multiple logical XML documents that are bundled for a single batch transmission</xs:documentation>
	</xs:annotation>
	<xs:sequence>
		<xs:any namespace="##other" processContents="strict" maxOccurs="unbounded"/>
	</xs:sequence>
</xs:complexType>




But in my parser, DataStage doesn't want to drill down into it, so all I get in the test data is this:

</BatchEnvelope>
<BatchContent><?xml version="1.0" encoding="UTF-8"?><?xml version="1.0" encoding="UTF-8"?><?xml version="1.0" encoding="UTF-8"?></BatchContent>
</AcRecBat:AcademicRecordBatch>


In the tree structure all I see is this:

ns0:BatchContent
- wildcard0
? e2res:text()
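
As a quick check outside DataStage, here is a minimal Python sketch (standard library only) that lists the namespace and local name of each child of BatchContent. The sample string is a trimmed, hypothetical stand-in for the real test file, and the function name `wildcard_children` is made up for illustration. The `xs:any namespace="##other"` wildcard accepts only elements from a namespace other than the target namespace of the schema that declares it, and `processContents="strict"` means a declaration for each such element has to be available to the validator.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the real test file (assumption: only the outer
# structure matters for this check).
SAMPLE = """\
<AcRecBat:AcademicRecordBatch
    xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.1.0">
  <BatchContent>
    <HSTrn:HighSchoolTranscript
        xmlns:HSTrn="urn:org:pesc:message:HighSchoolTranscript:v1.5.0">
      <TransmissionData/>
    </HSTrn:HighSchoolTranscript>
  </BatchContent>
</AcRecBat:AcademicRecordBatch>
"""


def wildcard_children(xml_text):
    """Return (namespace, local_name) for each child of BatchContent."""
    root = ET.fromstring(xml_text)
    # BatchContent is unqualified (elementFormDefault="unqualified"),
    # so it is looked up without a namespace.
    content = root.find("BatchContent")
    if content is None:
        return []
    children = []
    for child in content:
        # ElementTree stores qualified names as "{namespace}local".
        if child.tag.startswith("{"):
            namespace, local_name = child.tag[1:].split("}", 1)
        else:
            namespace, local_name = "", child.tag
        children.append((namespace, local_name))
    return children


if __name__ == "__main__":
    for namespace, local_name in wildcard_children(SAMPLE):
        print(local_name, "in", namespace)
```

If the embedded element's namespace shows up here but its schema is not available to the parser, a strict wildcard has nothing to resolve the element against. That is only a guess at what the stage is running into, not a confirmed diagnosis.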


Any thoughts on how to get this to work? I did validate all files at xmlvalidator.com
Will
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Thanks Will... I nuked the other topic. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
irvinew
Participant
Posts: 15
Joined: Mon Jun 18, 2018 8:52 am
Location: Regina, SK

Post by irvinew »

Update:

I got 1 row to work; it seems the reason the Hierarchical stage didn't want to drill down into the XML was that it couldn't resolve where the schema location was... I think.

Anyway I got it to resolve 2 different ways:

1) I butchered the xsd and made the highschool a complex type

Code: Select all

<xs:complexType name="HighSchoolTranscriptDataType">
	<xs:sequence>
		<xs:element name="TransmissionData" type="AcRec:TransmissionDataType" minOccurs="0" maxOccurs="unbounded"/>
		<xs:element name="Student" type="AcRec:K12StudentType" maxOccurs="unbounded"/>
		<xs:element name="NoteMessage" type="core:NoteMessageType" minOccurs="0"/>
		<xs:element name="UserDefinedExtensions" type="core:UserDefinedExtensionsType" minOccurs="0"/>
	</xs:sequence>
</xs:complexType>
2) I altered the schemaLocation to be a direct link

"urn:org:pesc:message:AcademicRecordBatch:v2.1.0 AcademicRecordBatch_v2.1.xsd"

became

"AcademicRecordBatch_v2.1.xsd"


The test file is based on AcademicRecordBatch, which is defined as follows:

Code: Select all

<xs:element name="AcademicRecordBatch">
		<xs:complexType>
			<xs:sequence>
				<xs:element name="BatchEnvelope" type="AcRec:TransmissionBatchType" minOccurs="0"/>
				<xs:element name="BatchContent" type="core:AcademicRecordBatchType"/>
			</xs:sequence>
		</xs:complexType>
</xs:element>
Anyway, regardless of how I altered the XSD, DataStage and my XML editor would always stumble in 2 ways.

It didn't like this line:

<HSTrn:HighSchoolTranscript xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:n1="http://www.altova.com/samplexml/other-namespace" xmlns:HSTrn="urn:org:pesc:message:HighSchoolTranscript:v1.5.0" xsi:schemaLocation="urn:org:pesc:message:HighSchoolTranscript:v1.5.0 HighSchoolTranscript_v1.5.0.xsd">


This line is in the testfile; it specifies the xml for the transcript
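
For reference, an `xsi:schemaLocation` value such as the one on that line is a whitespace-separated list of pairs, each a namespace URI followed by a location hint for that namespace's schema. A minimal Python sketch of splitting such a value into pairs follows; `schema_location_pairs` is a hypothetical helper, and whether DataStage actually honors the hint is not something this thread establishes.

```python
def schema_location_pairs(value):
    """Split an xsi:schemaLocation value into (namespace, location) pairs."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        # An odd token count means at least one namespace or location is missing.
        raise ValueError(
            "schemaLocation needs namespace/location pairs, got %d token(s)"
            % len(tokens)
        )
    return list(zip(tokens[0::2], tokens[1::2]))


# Value copied from the HighSchoolTranscript start tag in the test file.
HS_TRANSCRIPT_LOCATION = (
    "urn:org:pesc:message:HighSchoolTranscript:v1.5.0 "
    "HighSchoolTranscript_v1.5.0.xsd"
)

if __name__ == "__main__":
    for namespace, location in schema_location_pairs(HS_TRANSCRIPT_LOCATION):
        print(namespace, "->", location)
```

By this reading, a value shortened to the single token "AcademicRecordBatch_v2.1.xsd" is no longer in pair form. The attribute that takes a bare location is `xsi:noNamespaceSchemaLocation`, and it is meant for schemas without a target namespace.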


If I take it out, it complains that it can't find the 2 elements below (from the complex type above), which don't exist in the test file; but their minOccurs is 0, so ???

<xs:element name="NoteMessage" type="core:NoteMessageType" minOccurs="0"/>
<xs:element name="UserDefinedExtensions" type="core:UserDefinedExtensionsType" minOccurs="0"/>


Anyway, I am damned either way and can't catch a break. Can someone explain why DataStage cares about the missing files?
Why does DataStage complain about the <HSTRN:HighSchool................. line?


Sorry for the long post; trying to give as much info as possible.
Will
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I can't really help, but I am of the opinion that we'd rather have too much information than not enough. :wink:

Out of curiosity, have you involved your official support provider yet?
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Still having issues with this xml?
Ernie Ostic

blogit!
Open IGC is Here!: https://dsrealtime.wordpress.com/2015/0 ... ere/
irvinew
Participant
Posts: 15
Joined: Mon Jun 18, 2018 8:52 am
Location: Regina, SK

Solved

Post by irvinew »

I have solved my own problem.

1. DataStage does not like the XSD documents I had; I had to rework one of them, turning an element into a complex type, just so DataStage could drill down into the XML. The XML had an embedded XML document, and there could be 1 or many of those documents.

2. I set my validation to Reject

3. Whatever is in those XSD documents (there are 4 of them), DataStage doesn't like them: it consumes so much memory that it fails at some point, so by the time the parser reaches the transcripts it stops reading. Eventually the server begins to throw Java stack and heap errors and the system bogs down.

I don't know if it is a rookie mistake, but I still had sample test data in there for testing the parser. Once I took it out, it worked. At some point I made my own XSD document from the XML document to generate sample test data for the parser; that is where I stumbled on the answer. I don't know why the sample data bogs the system down; maybe someone can chime in on that.

Thanks to the moderators for the help; cheers.
Will