
Viewing UTF-8 data in DataStage on Unix Platform

Posted: Wed Oct 25, 2006 12:08 pm
by datastage_user
Hello All,

I have a peculiar problem when viewing text data in DataStage. When I view a file saved in Unicode format using the Unicode NLS mapping in DataStage, I encounter no problems. However, when I save the file in UTF-8 format and try to view it in DataStage with the UTF-8 NLS mapping, I get the following error: nls_read_delimited() - row 1, too many columns in record. How did changing to UTF-8 create this difference?

Thanks,

Vikram

Posted: Wed Oct 25, 2006 1:26 pm
by ray.wurlod
UTF-8 is a term that covers a multitude of different encodings of Unicode. DataStage actually uses an idiosyncratic encoding that should strictly be called UV-UTF8; it preserves dynamic array delimiter characters as single-byte characters, and therefore must map genuine Char(248) through Char(255) into the Unicode private use area.

Without knowing exactly what your format is, it is difficult to comment further.
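
One way to find out what the format actually is: look at the raw bytes of the file outside DataStage. The following is a minimal Python sketch, not anything DataStage provides. The function name sniff_encoding and its return labels are made up for illustration. It only checks for a byte-order mark and then tests whether the bytes are valid standard UTF-8, so treat the result as a hint rather than a verdict.

```python
import codecs

# Known byte-order marks. UTF-32 is listed before UTF-16 because the
# UTF-32 LE mark (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_encoding(data: bytes) -> str:
    """Guess the encoding of a file from its raw bytes (hypothetical helper)."""
    # A byte-order mark at the start is the strongest hint.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # No mark: check whether the bytes decode as standard UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8 (no BOM)"
    except UnicodeDecodeError:
        return "unknown (not valid UTF-8)"
```

To use it, read the file in binary mode, for example open(path, "rb").read(), and pass the bytes in. If the file saved as "utf-8" turns out to start with a byte-order mark, or the file saved as "unicode" is really UTF-16, that is worth reporting back along with the column delimiter you have defined.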

But let me restate that there are many different encodings that call themselves UTF-8. Visit the Unicode Consortium website to begin your search for more information.