Split unknown input field. Code Review & suggestions ple

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
larryoceanview
Participant
Posts: 70
Joined: Fri Dec 26, 2003 3:14 pm
Location: Plantation, FL

Split unknown input field. Code Review & suggestions ple

Post by larryoceanview »

I need to split an address coming in in one field into multiple fields for correcting.

I came up with a function that can work but is ugly! :(

And since I will probably have to do this over & over I would like to come up with something that is re-usable.

Any :idea: 'S .


Code: Select all

Var1= " "               
field1= " "
field1= " "
field3= " "
field4= " "
field5= " "
changedfield = " "
corrected = " "

Begin Case
  Case  Arg1 MATCHES  "...'1313 ROCKINGBIRD LANE'..." 
     Var1 = Ereplace(Arg1, " ", "~")
     field1= Field(Var1,"~",1)
     field2= Field(Var1,"~",2) 
     field3= Field(Var1,"~",3)
     field4= Field(Var1,"~",4) 
     field5= Field(Var1,"~",5)
End Case
Begin Case
     Case field1 = "ROCKINGBIRD"
          changedfield = "MOCKINGBIRD"
          corrected = changedfield : field2 : field3 : field4 : field5
     Case field2 = "ROCKINGBIRD"
          changedfield = "MOCKINGBIRD"
          corrected = field1: "|" : changedfield : "|" : field3: "|" : field4: "|" : field5
         
     Case field3 =  "ROCKINGBIRD"
          changedfield = "MOCKINGBIRD"
          corrected = field1 :  field2: changedfield : field4 : field5
     Case field4 = "ROCKINGBIRD"
          changedfield = "MOCKINGBIRD"
          corrected = field1 : field2 : field3 changedfield : field5
     Case field5 =  "ROCKINGBIRD"
          changedfield = "MOCKINGBIRD"
          corrected = field1 :  field2: field3 : field4 : changedfield
     Case @True                   ;* all other values
          corrected = Arg1
End Case
 
Ans = corrected   
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Let's start off with a couple address standardisation basics:
- QualityStage is the best tool for this type of work. You can spend weeks and months writing and testing address standardisation routines in DataStage or you could get better quality out of QualityStage in under a week.
- You should not correct the spelling of street names unless you have access to an address matching product certified by your postal authority, or you are using QualityStage address matching jobs.

You've provided a sample address which you standardise in a routine. We could certainly give you code that will split up these type of addresses if every address was in the same format: street number followed by street name followed by street type. The problem is with the hundreds of other address formats including unit numbers, floor numbers, po boxes, shopping centers, property titles, street number ranges, data entry errors etc etc.

If you cannot get QualityStage into your workplace then have a look at all your addresses and take note of the different formats, then design your standardisation routines as a SDLC with time spent on analysis and design first. Try not to write any code until you have worked out how to tackle all the combinations.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Code: Select all

Var1 = Arg1
Begin Case 
  Case  Arg1 MATCHES  "...'1313 ROCKINGBIRD LANE'..." 
     Var1 = Ereplace(Arg1, "ROCKINGBIRD", "MOCKINGBIRD") 
End Case 
Ans = Var1
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I'd just like to reinforce as strongly as possible everything Vincent posted. He was even nice enough to use US spelling for you! 8)
In particular, if you are going to "roll your own" do not stint on planning and design time.
US Postal certification is available to plug in directly to Quality Stage (which is the product formerly known as INTEGRITY from Vality, which was acquired by Ascential). It really will return the investment quickly, compared to a do-it-yourself approach, and won't miss any of the weird scenarios that might be encountered.
And it will handle it when you need to process Cuban and Haitian address formats! :wink:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
larryoceanview
Participant
Posts: 70
Joined: Fri Dec 26, 2003 3:14 pm
Location: Plantation, FL

Roll Your Own!

Post by larryoceanview »

I agree!

It seems like the attempted clean-up was a waste and did more harm then good besides waste valuable time!

I was trying to emulate existing code for migration, You didn't think this was my brainstorm? :roll:

There is an address process that is run down the line that should handle this.

I already wasted a day on this ( was a lot of other attempted fixes like adding a dash to a non-numeric street number, How would you know? without having the tools you mentioned) and was given the command to move out. :D
But who knows I might wind up back here later. :cry:


Thanks for all your replies.
Post Reply