US Address Standardization

Infosphere's Quality Product

Moderators: chulett, rschirm

Post Reply
satheesh_color
Participant
Posts: 182
Joined: Thu Jun 16, 2005 2:05 am

US Address Standardization

Post by satheesh_color »

Hi All,

I am new to Quality stage. I have a rquirement to validate US adresses(NA) for the below values.
Addr_Line1
Addr_Line2
City
State
Country
Zip/PostalCode

For that i am creating new ruleset(AdrValidate) with the below values

From PAT File:
\PRAGMA_START
SEPLIST " ~"
STRIPLIST " `!@#$%^*()_+={}[]|\\;\"<>?"
\PRAGMA_END

CLS File:
;-------------------------------------------------------------------------------
; Table Sort Order: 51-51 Ascending, 26-50 Ascending, 1-25 Ascending
;-------------------------------------------------------------------------------
AVENUE AVE A
AVE. AVE A
AVE, AVE A
ALLEY ALY A
ALY. ALY A
ALY, ALY A
CIRCLE CIR A
CIR. CIR A
CIR, CIR A
COURT CT A
CT. CT A
CT, CT A
COVE CV A
CV. CV A
CV, CV A
DRIVE DR A
DR. DR A
DR, DR A
FREEWAY FWY A
FWY. FWY A
FWY, FWY A
HIGHWAY HWY A
HWY. HWY A
HWY, HWY A
LANE LN A
LN. LN A
LN, LN A
PARKWAY PKWY A
PKWY. PKWY A
PKWY, PKWY A
PLACE PL A
PL. PL A
PL, PL A
STREET ST A
ST. ST A
ST, ST A
ATTENTION ATTN A
ATTN. ATTN A
ATTN, ATTN A
SUITE STE A
STE. STE A
STE, STE A
BOULEVARD BLVD A
BLVD. BLVD A
BLVD, BLVD A
ROAD RD A
RD. RD A
RD, RD A

And i am using the below in Transformer stage to get the required values
Trim(Ereplace(Ereplace(Ereplace(Ereplace(Ereplace(Ereplace(ln_std_adr.UnhandledData_AdrValidate,'C/O',''),'C/ O',''),'C /O',''),'C\O',''),'C\ O',''),'C \O',''))
as SVAddrValidate

Field(SVAddrValidate,' ~ ',1) =SVAddr_Line1
Field(Field(SVAddrValidate,' ~ ',2),' ',1) = SVAddr_Line2
Field(Field(SVAddrValidate,' ~ ',3),' ',1) =SCCity
Field(SVAddrValidate,' ~ ',4)=SVState
Field(SVAddrValidate,' ~ ',5)=SVCountry
Field(SVAddrValidate,' ~ ',6)=SVZip

SourceData:

1 DUNDAS STREET WEST, SUITE 2500=Addr_Line1
INTERWIND CORP.=Addr_Line2
TORONTO=City
ON=State
Country=CAN
M5G 1Z3=Zip

After doing the above standardization, i am getting the value like below
1 DUNDAS STREET WEST=AddrLine1
SUITE 2500=Addr_Line2(part of Addr_Line1 is overwritten to Addr_Line2)
Null=City
Country=TORONTO
State=CAN
ZIP=ON,M5G 1Z3

Kindly let me know how to overcome this issue, as i am new to this stage i am still struggling.

It could be useful on how to stip( ---(more than one dash), (more than one space), more than one comma...

Thanks & Regards,
S.R
satheesh_color
Participant
Posts: 182
Joined: Thu Jun 16, 2005 2:05 am

Post by satheesh_color »

Hi All,

After the standardize process, i got the below output
LiterL=~(Seperation for records)
Addr_Line1,Addr_Line2,City,state,Country,Zip
1 BOSTON PLACE,41ST FLOOR ~ ~ BOSTON ~ MA ~ USA ~ 02108

From the above example, while i try to get the required records i failed as Addr_Line2 is NULL, so 41ST FLOOR goes to Addr_Line2.
Field(SVAddrValidate,' ~ ',1) =SVAddr_Line1
Field(SVAddrValidate,' ~ ',2) =SVAddr_Line2
Field(SVAddrValidate,' ~ ',3) =City
Field(SVAddrValidate,' ~ ',4)=SVState
Field(SVAddrValidate,' ~ ',5)=SVCountry
Field(SVAddrValidate,' ~ ',6)=SVZip


Let me know your thoughts on the same
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What's wrong with using the out-of-the-box USADDR and USAREA rule sets?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
satheesh_color
Participant
Posts: 182
Joined: Thu Jun 16, 2005 2:05 am

Post by satheesh_color »

Hi Ray,

Thanks.

When i try to use USAREA
ln_std_adr.CityName_USAREA has all the values.

ln_std_adr.StateAbbreviation_USAREA is set as null and so on.
Eg: HOUSTON TX USA(ln_std_adr.CityName_USAREA)

Eg2 for Canada: SAINT BRUNO QC CAN (ln_std_adr.CityName_USAREA)
ln_std_adr.StateAbbreviation_USAREA,ln_std_adr.CountryCode_USAREA,ln_std_adr.ZipCode_USAREA set as null..

I am sorry, still i am struggling with that.

Regards,
S.R
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I am unable to reproduce this symptom. Have you tried "training" QualityStage with some overrides (e.g. HOUSTON is a city name)?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply