XML Input - Dataset Output

Posted: Thu Oct 29, 2015 8:34 am
by anil411
We are reading the XML file below. The last column (CondoProjectName) contains a special character.

<?xml version="1.0" encoding="UTF-8"?>
<typ:UCDPReceiveAppraisalRequest xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:typ="http://receiveappraisal.company.com/schema/types">
  <typ:CurrentFileSequenceNumber>1</typ:CurrentFileSequenceNumber>
  <typ:LastFileSequenceNumber>0</typ:LastFileSequenceNumber>
  <typ:DocumentFileIDRecordCount>1</typ:DocumentFileIDRecordCount>
  <typ:SyntheticTestIndicator>true</typ:SyntheticTestIndicator>
  <typ:RequestSubmitDateTime>2008-09-18T21:49:45</typ:RequestSubmitDateTime>
  <typ:RequestBeginDateTime>2014-09- 18T19:18:33</typ:RequestBeginDateTime>
  <typ:RequestEndDateTime>2006-08-19T13:27:14-04:00</typ:RequestEndDateTime>
  <typ:VendorName>XYZ Support</typ:VendorName>
  <typ:DocumentFiles>
    <typ:DocumentFile>
      <typ:DocumentFileID>FILEID-1</typ:DocumentFileID>
      <typ:DocumentFileStatus>Successful</typ:DocumentFileStatus>
      <typ:AppraisalRecordCount>1</typ:AppraisalRecordCount>
      <typ:Appraisals>
        <typ:Appraisal>
          <typ:DocumentID>DOCID-1</typ:DocumentID>
          <typ:DocumentType>2</typ:DocumentType>
          <typ:SubmissionStatus>In Progress</typ:SubmissionStatus>
          <typ:RawPropertyStreetAddress>1234 Any Street  Drive</typ:RawPropertyStreetAddress>
          <typ:RawPropertyUnitNumber>11</typ:RawPropertyUnitNumber>
          <typ:RawPropertyCity>Vienna</typ:RawPropertyCity>
          <typ:RawPropertyState>VA</typ:RawPropertyState>
          <typ:RawPropertyZipCode>22102</typ:RawPropertyZipCode>
          <typ:RawAppraiserLicenseNumber>string</typ:RawAppraiserLicenseNumber>
          <typ:RawAppraiserStateNumber>string</typ:RawAppraiserStateNumber>
          <typ:RawAppraiserCertificationNumber>string</typ:RawAppraiserCertificationNumber>
          <typ:ScrubbedAppraiserLicenseNumber>string</typ:ScrubbedAppraiserLicenseNumber>
          <typ:RawSupervisorAppraiserLicenseNumber>string</typ:RawSupervisorAppraiserLicenseNumber>
          <typ:RawSupervisorCertificationNumber>string</typ:RawSupervisorCertificationNumber>
          <typ:ScrubbedSupervisorAppraiserLicenseNumber>string</typ:ScrubbedSupervisorAppraiserLicenseNumber>
          <typ:AppraisedValueOfSubjectProperty>1000.00000000000</typ:AppraisedValueOfSubjectProperty>
          <typ:EffectiveDateOfAppraisal>2002-11-05-05:00</typ:EffectiveDateOfAppraisal>
          <typ:AppraisalFormNumberType>Small Residential Income Property Appraisal Report</typ:AppraisalFormNumberType>
          <typ:AssignmentType>Purchase</typ:AssignmentType>
          <typ:AssignmentTypeOther>string</typ:AssignmentTypeOther>
          <typ:PropertyRightsAppraised>Fee Simple</typ:PropertyRightsAppraised>
          <typ:PropertyRightsAppraisedOther>string</typ:PropertyRightsAppraisedOther>
          <typ:CondoProjectName>‘Foothills Addition</typ:CondoProjectName>
        </typ:Appraisal>
      </typ:Appraisals>
    </typ:DocumentFile>
  </typ:DocumentFiles>
</typ:UCDPReceiveAppraisalRequest>

After reading the XML input file, we write it to a Dataset. In the output, the value for CondoProjectName is written as "^ZFoothills Addition".

If I change the encoding from UTF-8 to ISO-8859-1, the data matches, but we receive the file from external agencies.

We want to continue using UTF-8, and the data should match between the XML and the Dataset. Please let me know if anybody has faced a similar issue.

Appreciate your help

Posted: Thu Oct 29, 2015 8:49 am
by ray.wurlod
Are you certain that there is no ^Z character in the source data? (Note, too, that this is the DOS end-of-file marker.)

Posted: Thu Oct 29, 2015 11:38 am
by anil411
Ray,

We don't have a ^Z character in the source data.

The Source Data is as below.

<typ:CondoProjectName>‘Foothills Addition</typ:CondoProjectName>

Output in Dataset is as below.

"^ZFoothills Addition"

Please advise me.

Posted: Thu Oct 29, 2015 9:21 pm
by ray.wurlod
Can you please advise what the actual values (hex codes) are for the first three source characters after the tag? ^Z is 0x1A (which may help you).

It appears that you are not using a compatible character map between source and target.
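
One way to get those hex codes is to read the file as raw bytes and dump what follows the opening tag. The following is only a sketch: the helper name `hex_after_tag` and the sample element are illustrative and not taken from the job. For reference, the left single quotation mark (U+2018) is the three bytes E2 80 98 in UTF-8 and the single byte 91 in Windows-1252, so the dump should indicate which encoding the file really uses.

```python
import re


def hex_after_tag(raw: bytes, tag: str = "CondoProjectName", count: int = 3) -> str:
    """Return the first `count` bytes after the opening tag, as spaced hex."""
    # Allow an optional namespace prefix such as "typ:" and any attributes.
    pattern = rb"<(?:[\w.-]+:)?" + re.escape(tag.encode("ascii")) + rb"\b[^>]*>"
    match = re.search(pattern, raw)
    if match is None:
        raise ValueError(f"opening tag {tag!r} not found")
    start = match.end()
    return raw[start:start + count].hex(" ")


if __name__ == "__main__":
    # Hypothetical sample element; for a real file use open(path, "rb").read().
    element = "<typ:CondoProjectName>\u2018Foothills Addition</typ:CondoProjectName>"
    print(hex_after_tag(element.encode("utf-8")))   # e2 80 98
    print(hex_after_tag(element.encode("cp1252")))  # 91 46 6f
```

If the dump starts with 91 rather than e2 80 98, the file is not valid UTF-8 despite its declaration, which would be consistent with a substitute character (0x1A) appearing in the output.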

Posted: Thu Oct 29, 2015 9:35 pm
by chulett
anil411 wrote: If I change the encoding from UTF-8 to ISO-8859-1, the data matches, but we receive the file from external agencies.
I don't understand the "but" here. Are you saying you change the encoding in the job and it works? Or if you change the first element in the XML file?

Posted: Fri Oct 30, 2015 6:27 am
by anil411
Chulett,

If encoding="UTF-8" is in the first line of the XML, the data has the issue.

If encoding="ISO-8859-1" is in the first line of the XML, the data matches between source and target.

I can't change the NLS settings, as they are disabled in our project.

Is there any function to resolve this issue?

Thank you,

Posted: Fri Oct 30, 2015 6:51 am
by chulett
Then it seems to me you have two options. One is to ask the vendor to make the change. The second is to pre-process the file and change it yourself using something like awk / sed / perl. This could be built in as a Before Job process or accomplished via a Sequence job.
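
As a rough illustration of the pre-processing option, here is a minimal sketch in Python rather than awk / sed / perl. It rewrites only the encoding named in the XML declaration and leaves every other byte untouched, which mirrors the manual change described earlier in the thread. The function name `redeclare_encoding` and the default target encoding are assumptions for illustration; the approach is only appropriate if the file's bytes really are in the single-byte encoding being declared.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the data, with either single or double quotes.
DECLARATION = re.compile(rb'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2')


def redeclare_encoding(raw: bytes, new_encoding: str = "ISO-8859-1") -> bytes:
    """Return `raw` with the declared encoding replaced; the body is unchanged."""
    name = new_encoding.encode("ascii")
    return DECLARATION.sub(
        lambda m: m.group(1) + m.group(2) + name + m.group(2),
        raw,
        count=1,
    )


if __name__ == "__main__":
    # Hypothetical two-line document standing in for the vendor file.
    sample = b'<?xml version="1.0" encoding="UTF-8"?>\n<a>\x91Foothills</a>'
    print(redeclare_encoding(sample))
```

A script like this could be called from a Before Job subroutine or a Sequence job's command activity, reading the incoming file in binary mode and writing the result to the path the job reads. A file that begins with a byte order mark would not match the pattern and would pass through unchanged.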

Posted: Mon Nov 02, 2015 6:51 am
by eostic
I am fairly surprised that the data isn't being escaped as &#nn; where nn is the hex value for each of the bytes in question. That might be an option for you as you consider editing the overall file as Craig suggests.

Ernie
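
For anyone who wants to try the escaping route while pre-processing, a sketch follows. Note that an XML numeric character reference names the character's code point, written in decimal as `&#8216;` or in hex as `&#x2018;`, rather than its individual bytes. The function name `to_char_refs` is illustrative, and the default source encoding of Windows-1252 is an assumption; the thread does not confirm what bytes the file actually contains.

```python
def to_char_refs(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode `raw` and re-encode it as ASCII with numeric character references.

    `source_encoding` is an assumption about the real encoding of the file;
    check the actual bytes before relying on it.
    """
    text = raw.decode(source_encoding)
    # Python's built-in "xmlcharrefreplace" handler writes each non-ASCII
    # character as a decimal reference such as &#8216;.
    return text.encode("ascii", errors="xmlcharrefreplace")


if __name__ == "__main__":
    # Hypothetical element whose first character is the single byte 0x91.
    sample = b"<typ:CondoProjectName>\x91Foothills Addition</typ:CondoProjectName>"
    print(to_char_refs(sample))
```

Because the result is pure ASCII, it is valid under the existing encoding="UTF-8" declaration, so the first line of the file would not need to change.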

Posted: Thu Nov 12, 2015 10:05 am
by ds_developer
What is the datatype (in DS) you are using for this field? I just did this without changing the encoding="UTF-8" designation by using the NVarChar datatype. No changes to the NLS settings either.