Hi,
I'm trying to read a file containing Chinese data. The read fails when the columns are defined as Char, but works when they are VarChar.
The existing regions already use Char, and we are trying to minimize changes to the layout.
The layout is a fixed-width file with Unicode Char columns:
FIRSTNAME:Char(30)-Unicode
MiddleName:Char(1)-Unicode
LastName:Char(30)-Unicode
Existing data:
COLNIE MPROLL
Chinese data:
李娜 MPROLL
When viewed in a hex editor, each Chinese character takes about 3 bytes.
So I tried laying out FirstName as 6 bytes of data plus 24 pad characters, but no luck:
External ustring too short. Imported only 0 external characters into a ustring of fixed length 1.
##W IIS-DSEE-TFIG-00201 09:53:53(001) <SQ,0> Field "MiddleName" has import error and no default value; data: <empty>, at offset: 787
Is only VarChar supported for multi-byte data?
Thanks in advance!
Chinese Char /UTF-8
UTF-8 is an encoding of the Unicode code points, and does handle multi-byte data (though using up to four bytes per character).
VarChar will give you fewer problems than Char, because the latter requires fields to be padded to length.
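To illustrate why padding a Char field by bytes goes wrong, here is a minimal Python sketch (not DataStage code). It assumes the field width is counted in characters and the pad character is a space; `pad_field` is a hypothetical helper made up for this example. It shows that a 30-character field holds 30 bytes for Latin data but 34 bytes in UTF-8 once two Chinese characters are in it, so "6 bytes of data + 24 pad characters" describes neither a 30-character nor a 30-byte field.

```python
def pad_field(value: str, width: int, pad: str = " ") -> str:
    """Pad a value to a fixed width measured in characters (hypothetical helper)."""
    if len(value) > width:
        raise ValueError(f"value is longer than {width} characters")
    return value.ljust(width, pad)


# Both fields are 30 characters wide.
latin = pad_field("COLNIE", 30)
chinese = pad_field("李娜", 30)

# Character counts match, byte counts do not.
latin_bytes = len(latin.encode("utf-8"))      # 6 data bytes + 24 pad bytes
chinese_bytes = len(chinese.encode("utf-8"))  # 2 * 3 data bytes + 28 pad bytes

print(len(latin), latin_bytes)      # 30 30
print(len(chinese), chinese_bytes)  # 30 34
```

Whether a given reader expects the width in characters or in bytes is something to confirm against the actual column definition and file; the sketch only shows that the two measures diverge for multi-byte data.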
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks for your replies.
Are there any other issues apart from the difference in byte space between UTF-8 and UTF-16?
1) We were able to use UTF-8 for Thai and Chinese data, with sequential files as both input and output.
2) Is there a way to check this in DataStage? Right now I don't have a temporary database to compare byte space usage between UTF-8 and UTF-16.
3) Is it possible to read UTF-16 mainframe input, process it in DataStage, and load it into MDM tables (UTF-8)?
Thanks.
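For questions 2 and 3, the byte-space comparison and the transcoding step can be sketched outside DataStage and without a database. The Python below is only an illustration: `byte_sizes` and `transcode_utf16_to_utf8` are hypothetical helper names, and the big-endian default for the source is an assumption that would need checking against the actual mainframe file.

```python
def byte_sizes(text: str) -> dict:
    """Return the encoded size of text in UTF-8 and UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


def transcode_utf16_to_utf8(raw: bytes, source: str = "utf-16-be") -> bytes:
    """Decode UTF-16 bytes and re-encode them as UTF-8.

    The default source encoding (big-endian) is an assumption.
    """
    return raw.decode(source).encode("utf-8")


print(byte_sizes("李娜"))    # {'utf-8': 6, 'utf-16': 4}
print(byte_sizes("COLNIE"))  # {'utf-8': 6, 'utf-16': 12}

# Round trip: simulated UTF-16 input converted to UTF-8.
utf16_input = "李娜 MPROLL".encode("utf-16-be")
utf8_output = transcode_utf16_to_utf8(utf16_input)
print(utf8_output.decode("utf-8"))  # 李娜 MPROLL
```

The pattern is that Chinese text in the Basic Multilingual Plane is smaller in UTF-16 (2 bytes per character against 3 in UTF-8), while Latin text is smaller in UTF-8, so the space difference depends on the mix of data.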