Removal of Special Character

Posted: Sat May 08, 2010 10:21 pm
by shobhit_vk_gupta
Can anybody tell me how to remove special characters from a string in DataStage?

Posted: Sun May 09, 2010 3:34 am
by ray.wurlod
Yes but, before we do, first explain why these characters are "special" and whether you have the permission of the owner of the data to make such a change.

Posted: Tue May 11, 2010 5:19 am
by shobhit_vk_gupta
Yes, I do have the owner's permission. This was a small requirement to do an alphanumeric comparison on two columns, which is why I was trying to remove special characters from the column.

Posted: Tue May 11, 2010 6:08 am
by chulett
No-one can help you unless you can (and do) explain exactly what you mean by "special characters". By itself that statement means nothing, unfortunately.

Posted: Tue May 11, 2010 4:58 pm
by ray.wurlod
Except for Special Ed.

He's definitely a special character.

Posted: Wed May 12, 2010 10:23 am
by shobhit_vk_gupta
Special characters are like:

!@#$%^&*()_-+={}[]|\:";'?><,./~`

In simple words, any non-alphanumeric character is a special character.

Posted: Wed May 12, 2010 10:30 am
by anbu
Use external filter

Filter command : sed "s/[^0-9A-Za-z]//g"

Posted: Wed May 12, 2010 4:32 pm
by ray.wurlod
Philosophically, they're not special, they're just non-alphanumeric.

Convert("!@#$%^&*()_-+={}[]|\:;'?><,./~`" : '"', "", InLink.TheString)

Posted: Fri May 14, 2010 9:51 pm
by shobhit_vk_gupta
I have given examples of non-alphanumeric characters. Kindly let me know if there is any general logic.

Posted: Sat May 15, 2010 6:45 am
by chulett
You mean other than what people have already posted? :?

Think about what it is you need to accomplish; you should be able to solve this. You can either build a list of everything you want to delete and then do so (one of the solutions posted), or build a list of everything you want to keep and then delete everything else (the other solution posted). If you want to do this all "in DataStage", the same function can be used for both - Convert() - with one being a little simpler than the other, while the 'better' solution (the latter, IMHO) is a little trickier but will be quite interesting and easy to understand once you see it.

Give this a shot and let us know if you have any specific questions.

ps. Both methodologies and the whole issue of "special" characters have been discussed here ad nauseam. A proper Exact Match search should turn them up.
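
For anyone who wants to see the shape of the "keep list" approach described above, here is a rough sketch in Python rather than DataStage BASIC. The `convert` function is a hypothetical stand-in that assumes Convert() replaces each listed character with the one at the same position in the replacement list and deletes it when there is none; `KEEP` and `keep_only_alnum` are likewise names invented for this illustration. In a Transformer derivation the same idea would presumably be written as one Convert() nested inside another, but treat that as an assumption to check against the documentation.

```python
import string


def convert(from_chars, to_chars, text):
    """Rough model of Convert(): characters of text found in from_chars are
    replaced by the character at the same position in to_chars, or deleted
    when to_chars has no character at that position."""
    out = []
    for ch in text:
        pos = from_chars.find(ch)
        if pos < 0:
            out.append(ch)             # not listed: keep unchanged
        elif pos < len(to_chars):
            out.append(to_chars[pos])  # listed with a replacement
        # listed without a replacement: drop the character
    return "".join(out)


# Everything we want to keep (assumption: ASCII letters and digits only).
KEEP = string.ascii_letters + string.digits


def keep_only_alnum(text):
    # Inner call: delete the keepers, leaving only the unwanted characters.
    unwanted = convert(KEEP, "", text)
    # Outer call: delete exactly those unwanted characters from the original.
    return convert(unwanted, "", text)
```

The appeal of this form is that the list of unwanted characters is derived from the data itself, so nothing has to be enumerated by hand. The "delete list" approach is the single call `convert(specials, "", text)` with a hand-built list of characters.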