Additional info: DataStage v8.5.0.1 on AIX v6.1 using DB2 v9.7.0.4
Problem:
- The job returns the following warning from the dsOppClh stage, which reads a dataset:
dsOppClh,1: Invalid character(s) ([xFC]) found converting string (code point(s): Standard.G[xFC]nter Suczynski) from codepage UTF-8 to Unicode, substituting.
Investigation:
- The job has the NLS default map set to UTF-8
- Reading the database via the db2 client returns the string as expected: Günter
- Using the db2connector rather than the dataset returns the string as: Günter
- Reading the dataset via "orchadmin dump" returns the problem string as: Günter
- I was once told that the DB2 database uses UTF-8 (I do not know the command to confirm that, but I believe the db2 client uses the operating system setting).
- AIX locale command returns: LANG=C
- The project has the parallel NLS default map set to ISO-8859-1 (which I believe also maps [xFC] to ü).
- The dataset was created via a job that reads the table and passes all columns using RCP to a copy stage (that clears the partitioning), and then writes the dataset. Originally that job used the project setting, but has now been recompiled with the NLS default map set to UTF-8.
- I tried rewriting the extract job without RCP, and the dataset can now be read without causing the warning.
So it looks like there's something wrong with the job that's using RCP.
When using RCP, the job informs us that: "External schema field, name: CLH_ID, type: STRING was not found in the design schema" (CLH_ID is the column that contains string "Günter").
My guess is that DB2 does not provide DataStage with a "ustring" datatype (to db2, the column is simply "varchar"). So when using RCP, the column is saved as "string" instead of "ustring".
Is my guess right?
Does anyone have any suggestions for a fix or workaround?
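To illustrate why the byte [xFC] triggers the warning, here is a minimal Python sketch. It is not DataStage code, and it only assumes that the dataset holds the name as single-byte ISO-8859-1 data that is then read with a UTF-8 map; the helper name try_decode is made up for the example. It shows that the byte is valid as ISO-8859-1 but invalid as UTF-8, where ü is the two bytes C3 BC.

```python
# Bytes as reported in the warning: "G[xFC]nter"
raw = b"G\xfcnter"


def try_decode(data: bytes, codec: str) -> str:
    """Decode data; return a marker string if the bytes are invalid for codec."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded
        return f"<invalid byte 0x{data[exc.start]:02X} at offset {exc.start}>"


# 0xFC is the code point for u-umlaut in ISO-8859-1, so this decodes cleanly
as_latin1 = try_decode(raw, "iso-8859-1")

# The same byte is not a legal UTF-8 sequence, so this decode fails
as_utf8 = try_decode(raw, "utf-8")

# For comparison: how the name looks when it really is UTF-8 encoded
utf8_bytes = "G\u00fcnter".encode("utf-8")

print(as_latin1)
print(as_utf8)
print(utf8_bytes)
```

If this assumption holds, the data in the dataset is not UTF-8 at all, which would fit the column being carried through RCP as a plain "string" rather than a "ustring".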
Thanks for the suggestion, Venki, but it won't work in our case.
I forgot to explain the job design.
Job1:
Reads the given DB2 table using RCP, and writes it to a dataset (e.g. <tablename>.ds).
Job2:
Reads this dataset, with the columns explicitly defined (as ustring).
So we have no opportunity to change the data type in the schema file.
As a workaround, we redefined the problem column in DB2 as VARGRAPHIC. RCP interprets this as ustring[], which avoids the warning.
John Wyman (one of the brilliant guys from IBM Tech Support) has given me a few ideas to try when I get the chance. So stay tuned.