Page 1 of 1

Junk characters when moving data from VARCHAR to CHAR

Posted: Mon Aug 13, 2018 5:24 am
by deesh
Hi,

My source having CHAR datatype and target has VARCHAR datatype. While moving data I am getting junk character into target.

Same thing s happening while moving data from VARCHAR to CHAR.

One of the options is to trim the column through out the job, but I need a better option.

Please suggest how to remove the junk char.

Posted: Mon Aug 13, 2018 6:26 am
by chulett
Well, let's start with trying to explain what exactly your "junk" characters are. The vast majority of the time, they're simply perfectly good but misunderstood and mistranslated data, i.e. a characterset conversion issue. Can you post some before and after examples, please, so we can get you some proper help? I'd also like to know what exactly your source and target are and if you have any idea what characterset they are.

Posted: Mon Aug 13, 2018 4:07 pm
by deesh
It is DB2 data, let me show an example

VARCHAR(5) data "ABCD" moved to CHAR(5) now data "ABCD " in the 5th character junk will happen instead of space.

Posted: Mon Aug 13, 2018 6:40 pm
by chulett
You need to tell us what "junk" means, what the value of this unwanted character is. There must be a DB2 function that will give you that, on Oracle is it called DUMP. Also, what is your APT_STRING_PADCHAR environment variable set to? Left at its default, it can look like "junk" to some folks / systems.

Posted: Tue Aug 14, 2018 12:24 pm
by asorrell
Because you have not identified what kind of "junk" you are getting - my best guess is that your project is still using the default value 0x0 (Hex 0) for padding CHAR fields. CHAR fields are always automatically padded to their max length. Your job is probably using the default value from $APT_STRING_PADCHAR in the project environment variable settings.

Hex 0 is a somewhat problematic default. The 0x0 character is invisible in most cases unless you use "cat -tv", "od" or some other hex viewing mechanism.

The quick fix is to put $APT_STRING_PADCHAR into your job and set it to a space.

Note - if other jobs are also using that 0x0 default pad character, you might have issues with data from those jobs failing joins with data from your jobs.