Issue: I have a column, CustomerName varchar(50), in which the data looks like this:
MHI三原 ホストダウンサイジング 棚卸フェーズ
I want to know whether this data consists of Japanese/Chinese characters or junk characters.
Can anyone guide me to the correct stage or rules to define in QualityStage that will divide the data into Chinese, Japanese and junk?
Can anyone please advise on this?
Recognition of junk
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
These are Japanese characters.
There are no rules in QualityStage for dividing characters based on the character sets to which each belongs. That's not really what QualityStage is for, although you might be able to create a heavily customised rule set.
The preferred tool would be DataStage, and you'd still need some custom code to identify whether a particular character belongs to a particular character set (aka code page). But why?
Know also that Chinese, Japanese and Korean share many thousands of ideographs (the CJK Unified Ideographs in the Unicode Standard), so a Han character on its own does not tell you which language it belongs to.
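To give an idea of what such custom code might look like, here is a minimal Python sketch (not DataStage or QualityStage code) that classifies a string by the Unicode blocks its characters fall in. The function names, the category labels and the definition of "junk" (control, unassigned, private-use and surrogate code points, plus the U+FFFD replacement character) are assumptions made for illustration. It also assumes the value has already been decoded to Unicode correctly; bytes read with the wrong code page can produce valid-looking but wrong characters that this check will not catch.

```python
import unicodedata
from collections import Counter

def classify_char(ch):
    """Return a coarse label for one character, based on its code point."""
    cp = ord(ch)
    if ch.isspace():
        return "space"
    if cp < 0x80:
        return "ascii" if ch.isprintable() else "junk"
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    # Katakana block plus the half-width katakana range
    if 0x30A0 <= cp <= 0x30FF or 0xFF66 <= cp <= 0xFF9F:
        return "katakana"
    # CJK Unified Ideographs and Extension A (shared by Chinese, Japanese, Korean)
    if 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF:
        return "han"
    if 0xAC00 <= cp <= 0xD7A3:
        return "hangul"
    # Assumed definition of junk: replacement character, control,
    # unassigned, private-use and surrogate code points
    if cp == 0xFFFD or unicodedata.category(ch) in ("Cc", "Cn", "Co", "Cs"):
        return "junk"
    return "other"

def classify_text(text):
    """Label a whole string; kana is what marks text as Japanese."""
    counts = Counter(classify_char(ch) for ch in text)
    counts.pop("space", None)
    if counts.get("junk"):
        return "junk"
    if counts.get("hiragana") or counts.get("katakana"):
        return "japanese"
    if counts.get("hangul"):
        return "korean"
    if counts.get("han"):
        # Han characters alone cannot be assigned to a single language
        return "cjk-ambiguous"
    return "non-cjk"

print(classify_text("MHI三原 ホストダウンサイジング 棚卸フェーズ"))
```

The sample value from the question is labelled "japanese" because it contains katakana; a value made up only of Han characters, such as "三原", comes back as "cjk-ambiguous" for the reason given above.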
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
That is standard Japanese text, discussing host downsizing.
Actually, what I want is to convert the given characters to English.
As this data arrives on a daily basis, I want to work out a process in DataStage that will convert Japanese characters to English characters.
Can this process be implemented in DataStage itself?
Thanks
Yes, but not meaningfully. For example you can transliterate (specify the sound of a Japanese character using English characters, such as "東" and "京" becoming "Tō" and "kyō") but there is no one-to-one correspondence between CJK characters and English characters. Anyone who specifies such a requirement is ignorant of the differences. Resist stupid requirements!
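As a rough illustration of what transliteration does and does not give you, here is a small Python sketch. The `KATAKANA_ROMAJI` table and the `romanize` function are hypothetical and hand-made for this example; the table is deliberately partial and covers only the katakana in one word from the sample data, using Hepburn-style spellings.

```python
# Hypothetical, partial katakana-to-romaji table (Hepburn-style spellings).
# A real table would need every kana plus rules for long vowels and small kana.
KATAKANA_ROMAJI = {
    "ホ": "ho", "ス": "su", "ト": "to", "ダ": "da", "ウ": "u", "ン": "n",
    "サ": "sa", "イ": "i", "ジ": "ji", "グ": "gu",
}

def romanize(text):
    """Replace known katakana with their sounds; leave everything else as-is."""
    return "".join(KATAKANA_ROMAJI.get(ch, ch) for ch in text)

print(romanize("ホストダウンサイジング"))  # hosutodaunsaijingu
print(romanize("三原"))                    # unchanged: no kanji entries in the table
```

The first result spells out the sounds of the word but is still not the English phrase "host downsizing", and the kanji pass through untouched because their readings depend on context and cannot be looked up character by character in a simple table like this one.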