DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
Author Message
satheesh_color
Participant



Joined: 16 Jun 2005
Posts: 178

Points: 2104

Post Posted: Mon Jun 11, 2018 2:36 am Reply with quote    Back to top    

DataStage® Release: 11x
Job Type: Parallel
OS: Unix
Additional info: US Address Validation using Standardize stage
Hi All,

I am new to Quality stage. I have a rquirement to validate US adresses(NA) for the below values.
Addr_Line1
Addr_Line2
City
State
Country
Zip/PostalCode

For that i am creating new ruleset(AdrValidate) with the below values

From PAT File:
\PRAGMA_START
SEPLIST " ~"
STRIPLIST " `!@#$%^*()_+={}[]|\\;\"<>?"
\PRAGMA_END

CLS File:
;-------------------------------------------------------------------------------
; Table Sort Order: 51-51 Ascending, 26-50 Ascending, 1-25 Ascending
;-------------------------------------------------------------------------------
AVENUE AVE A
AVE. AVE A
AVE, AVE A
ALLEY ALY A
ALY. ALY A
ALY, ALY A
CIRCLE CIR A
CIR. CIR A
CIR, CIR A
COURT CT A
CT. CT A
CT, CT A
COVE CV A
CV. CV A
CV, CV A
DRIVE DR A
DR. DR A
DR, DR A
FREEWAY FWY A
FWY. FWY A
FWY, FWY A
HIGHWAY HWY A
HWY. HWY A
HWY, HWY A
LANE LN A
LN. LN A
LN, LN A
PARKWAY PKWY A
PKWY. PKWY A
PKWY, PKWY A
PLACE PL A
PL. PL A
PL, PL A
STREET ST A
ST. ST A
ST, ST A
ATTENTION ATTN A
ATTN. ATTN A
ATTN, ATTN A
SUITE STE A
STE. STE A
STE, STE A
BOULEVARD BLVD A
BLVD. BLVD A
BLVD, BLVD A
ROAD RD A
RD. RD A
RD, RD A

And i am using the below in Transformer stage to get the required values
Trim(Ereplace(Ereplace(Ereplace(Ereplace(Ereplace(Ereplace(ln_std_adr.UnhandledData_AdrValidate,'C/O',''),'C/ O',''),'C /O',''),'C\O',''),'C\ O',''),'C \O',''))
as SVAddrValidate

Field(SVAddrValidate,' ~ ',1) =SVAddr_Line1
Field(Field(SVAddrValidate,' ~ ',2),' ',1) = SVAddr_Line2
Field(Field(SVAddrValidate,' ~ ',3),' ',1) =SCCity
Field(SVAddrValidate,' ~ ',4)=SVState
Field(SVAddrValidate,' ~ ',5)=SVCountry
Field(SVAddrValidate,' ~ ',6)=SVZip

SourceData:

1 DUNDAS STREET WEST, SUITE 2500=Addr_Line1
INTERWIND CORP.=Addr_Line2
TORONTO=City
ON=State
Country=CAN
M5G 1Z3=Zip

After doing the above standardization, i am getting the value like below
1 DUNDAS STREET WEST=AddrLine1
SUITE 2500=Addr_Line2(part of Addr_Line1 is overwritten to Addr_Line2)
Null=City
Country=TORONTO
State=CAN
ZIP=ON,M5G 1Z3

Kindly let me know how to overcome this issue, as i am new to this stage i am still struggling.

It could be useful on how to stip( ---(more than one dash), (more than one space), more than one comma...

Thanks & Regards,
S.R
satheesh_color
Participant



Joined: 16 Jun 2005
Posts: 178

Points: 2104

Post Posted: Mon Jun 11, 2018 7:21 am Reply with quote    Back to top    

Hi All,

After the standardize process, i got the below output
LiterL=~(Seperation for records)
Addr_Line1,Addr_Line2,City,state,Country,Zip
1 BOSTON PLACE,41ST FLOOR ~ ~ BOSTON ~ MA ~ USA ~ 02108

From the above example, while i try to get the required records i failed as Addr_Line2 is NULL, so 41ST FLOOR goes to Addr_Line2.
Field(SVAddrValidate,' ~ ',1) =SVAddr_Line1
Field(SVAddrValidate,' ~ ',2) =SVAddr_Line2
Field(SVAddrValidate,' ~ ',3) =City
Field(SVAddrValidate,' ~ ',4)=SVState
Field(SVAddrValidate,' ~ ',5)=SVCountry
Field(SVAddrValidate,' ~ ',6)=SVZip


Let me know your thoughts on the same
Rate this response:  
Not yet rated
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54407
Location: Sydney, Australia
Points: 295092

Post Posted: Mon Jun 11, 2018 3:05 pm Reply with quote    Back to top    

What's wrong with using the out-of-the-box USADDR and USAREA rule sets?

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
currently hiring: Canberra, Sydney and Melbourne (especially seeking good business analysts)
Rate this response:  
Not yet rated
satheesh_color
Participant



Joined: 16 Jun 2005
Posts: 178

Points: 2104

Post Posted: Tue Jun 12, 2018 2:42 am Reply with quote    Back to top    

Hi Ray,

Thanks.

When i try to use USAREA
ln_std_adr.CityName_USAREA has all the values.

ln_std_adr.StateAbbreviation_USAREA is set as null and so on.
Eg: HOUSTON TX USA(ln_std_adr.CityName_USAREA)

Eg2 for Canada: SAINT BRUNO QC CAN (ln_std_adr.CityName_USAREA)
ln_std_adr.StateAbbreviation_USAREA,ln_std_adr.CountryCode_USAREA,ln_std_adr.ZipCode_USAREA set as null..

I am sorry, still i am struggling with that.

Regards,
S.R
Rate this response:  
Not yet rated
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54407
Location: Sydney, Australia
Points: 295092

Post Posted: Tue Jun 12, 2018 9:18 pm Reply with quote    Back to top    

I am unable to reproduce this symptom. Have you tried "training" QualityStage with some overrides (e.g. HOUSTON is a city name)?

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
currently hiring: Canberra, Sydney and Melbourne (especially seeking good business analysts)
Rate this response:  
Not yet rated
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours