Seperate Numeric field into two with Charcaters as delimitte

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
Abhinav
Premium Member
Premium Member
Posts: 65
Joined: Tue Jun 29, 2004 10:26 am
Location: California

Seperate Numeric field into two with Charcaters as delimitte

Post by Abhinav »

Hi,

I have source files with one of the column which has values as

Code: Select all


       Field1

  12345 + 3456
  12345 and 897698
  234455 * 82379
 783845 &*# 2938

I need to have my ouput into two columns with only numeric values into them, i should treat any special charcters as my delimitter.

My output should like

Code: Select all

  Column1     Column2

   12345          3456
   12345        897698
  234455        82379
  783845         2938

Is this possible to accomplish through datastage in server edition if yes can you throw some ides on this.

I appreciate your help.

Thanks

Abhi
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Write a DS Function like ParseYourText(Arg1, Arg2) where Arg1 is the string and Arg2 is the position you need returned (in this case usage is 1 or 2).

Tryout some code like:

Code: Select all

MyText = Arg1
CONVERT "!@#$%^&*()_+-=[]\;';,./<>?" TO "" IN MyText
MyText=CHANGE(UPCASE(MyText), "AND", " ")
MyText = TRIM(MyText)
Ans = FIELD(MyText, " ",Arg2)
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Woth
Participant
Posts: 24
Joined: Thu Mar 18, 2004 8:18 pm

Post by Woth »

hey!

Column1 = myString[1,index(field1," ",1)-1]
Column2 = myString[index(field1," ",2)+1,len(field1)]


you might have to play around with the +1 and -1 in each. and might have to trim 'myString' before you start playing around with it.

Basically, you search for the first 'space' . that gives you the end of your first answer. Then you search for the second 'space', that gives you the start of your next answer

hope it helps
Woth
Participant
Posts: 24
Joined: Thu Mar 18, 2004 8:18 pm

Post by Woth »

terribly sorry


it should read

Column1 = field1[1,index(field1," ",1)-1]
Column2 = field1[index(field1," ",2)+1,len(field1)]
NBALA
Participant
Posts: 48
Joined: Tue Jul 11, 2006 11:52 am
Location: IL, USA

Post by NBALA »

Hi,

You can use BASIC Field function,

Field (string, delimiter, instance [ ,number] )

String is your input - 234455 * 82379 in your case

delimiter is the character that delimits the substring

instance specifies which instance of delimiter terminates the substring.

number specifies the number of delimited substrings to return.

Code: Select all

 Field("234455 * 82379", "*",1) 
Will Returns 234455

Code: Select all

 Trim(Field("234455 * 82379", "*",2)) 
will returns 82379

Hope this helps.

-NB
Abhinav
Premium Member
Premium Member
Posts: 65
Joined: Tue Jun 29, 2004 10:26 am
Location: California

Seperate fields into two from one

Post by Abhinav »

The problem is we are not sure what would be the delimitter, the only thing we are sure is we will have non numeric characters as the delimitter, it could be a space or * or + or and or any charcater and of any length, we are not sure the length of the entire field as well.

Is there a way to do this in Datastage.

Thanks for all your thoghts so far,

Any further suggestion to get this resolved would be appreciated.

Thanks

Abhi


Woth wrote:hey!

Column1 = myString[1,index(field1," ",1)-1]
Column2 = myString[index(field1," ",2)+1,len(field1)]


you might have to play around with the +1 and -1 in each. and might have to trim 'myString' before you start playing around with it.

Basically, you search for the first 'space' . that gives you the end of your first answer. Then you search for the second 'space', that gives you the start of your next answer

hope it helps
OttMAdpttch
Charter Member
Charter Member
Posts: 6
Joined: Thu Mar 27, 2003 1:55 pm
Contact:

Re: Seperate Numeric field into two with Charcaters as delim

Post by OttMAdpttch »

Probably the simplest means of accomplishing the extraction of one or the other values is to create a simply Datastage Transform that has three visible arguments: DataToParse (actual data value), Delimiter (such as space), and ValueToReturn (integer value 1,2,3, etc. to indicate which value you want returned). In the definition field of the transform insert the following command:

Field(Trim(%DataToParse%),%Delimiter%,%ValueToReturn%))

Once you've created this transform, you'll be able to use it in the derivation of your target column(s). Keep in mind that a transform has pros and cons: 1) pro: it is less expensive to use a transform then to call a routine because it is compiled as part data stage job's object. 2) con: if you change the transform you'll need to recompile any jobs that are using it.

Here is what the derivation column might look like if your data was "1234 + 5678" and the transform is named ParseData:

ParseData(linkname.column," ",1) returns 1234
ParseData(linkname.column," ",3) returns 5678
ParseData(linkname.column,"+",1) returns 1234b (b represents space)
ParseData(linkname.column,"+",2) returns b5678 (b represents space)

Hope this helps!

Mark

Abhinav wrote:Hi,

I have source files with one of the column which has values as

Code: Select all


       Field1

  12345 + 3456
  12345 and 897698
  234455 * 82379
 783845 &*# 2938

I need to have my ouput into two columns with only numeric values into them, i should treat any special charcters as my delimitter.

My output should like

Code: Select all

  Column1     Column2

   12345          3456
   12345        897698
  234455        82379
  783845         2938

Is this possible to accomplish through datastage in server edition if yes can you throw some ides on this.

I appreciate your help.

Thanks

Abhi
Mark Ott
DataStage Architect
Adept Technologies, Inc.
Abhinav
Premium Member
Premium Member
Posts: 65
Joined: Tue Jun 29, 2004 10:26 am
Location: California

Re: Seperate Numeric field into two with Charcaters as delim

Post by Abhinav »

Hi Mark,

Thanks for your help on this, but my problem i dodnot know what could be my delimitter inorder to use field function. Delimitter varies for each record and it could be any value other than numeric as i mentioned, my problem is i cannot use any defined delimitter.

I should have a solution in which i should be abble to parse any NON Numeric characters of any length in between the numeric values as my delimitter.

Is this possible. If yes how can we accomplish this.


Thanks

Abhi
OttMAdpttch wrote:Probably the simplest means of accomplishing the extraction of one or the other values is to create a simply Datastage Transform that has three visible arguments: DataToParse (actual data value), Delimiter (such as space), and ValueToReturn (integer value 1,2,3, etc. to indicate which value you want returned). In the definition field of the transform insert the following command:

Field(Trim(%DataToParse%),%Delimiter%,%ValueToReturn%))

Once you've created this transform, you'll be able to use it in the derivation of your target column(s). Keep in mind that a transform has pros and cons: 1) pro: it is less expensive to use a transform then to call a routine because it is compiled as part data stage job's object. 2) con: if you change the transform you'll need to recompile any jobs that are using it.

Here is what the derivation column might look like if your data was "1234 + 5678" and the transform is named ParseData:

ParseData(linkname.column," ",1) returns 1234
ParseData(linkname.column," ",3) returns 5678
ParseData(linkname.column,"+",1) returns 1234b (b represents space)
ParseData(linkname.column,"+",2) returns b5678 (b represents space)

Hope this helps!

Mark

Abhinav wrote:Hi,

I have source files with one of the column which has values as

Code: Select all


       Field1

  12345 + 3456
  12345 and 897698
  234455 * 82379
 783845 &*# 2938

I need to have my ouput into two columns with only numeric values into them, i should treat any special charcters as my delimitter.

My output should like

Code: Select all

  Column1     Column2

   12345          3456
   12345        897698
  234455        82379
  783845         2938

Is this possible to accomplish through datastage in server edition if yes can you throw some ides on this.

I appreciate your help.

Thanks

Abhi
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Original post at 13:21 (my time). Solution posted by Ken at 14:16. Now 16:34. Seems like becoming a premium member so that you can see the solution would be a real bargain. :roll:

Mike
Last edited by Mike on Thu Sep 14, 2006 3:36 pm, edited 1 time in total.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The MatchField() function will work in this case.
MatchField(InLink.TheString, "0N0X0N",1) for the first numeric field, MatchField(InLink.TheString, "0N0X0N",3) for the second numeric field.

Read about this function in online help or in the DataStage BASIC manual.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Abhinav
Premium Member
Premium Member
Posts: 65
Joined: Tue Jun 29, 2004 10:26 am
Location: California

Post by Abhinav »

Thanks a lot Ray,

I hope this will work for me, i will try with MatchField function and if i still need any further help i will get back to you. I hope i will not.

Thanks everyone for all your thoughts, inputs and suggestions.

I really appreciate all your help and support.

Thanks

Abhi


ray.wurlod wrote:The MatchField() function will work in this case.
MatchField(InLink.TheString, "0N0X0N",1) for the first numeric field, MatchField(InLink.TheString, "0N0X0N",3) for the second numeric field.

Read ...
Post Reply