extracting start and end position using index function

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

kirankota79
Premium Member
Premium Member
Posts: 315
Joined: Tue Oct 31, 2006 3:38 pm

extracting start and end position using index function

Post by kirankota79 »

I have a sample data like this

"EMP_BASE_RATE=8.81","EMP_BIRTH_DATE=08/03/1987 00:00","EMP_FIRSTNAME=MANUEL","EMP_LASTNAME=BUENO","EMP_FULLNAME=BUENO, MANUEL","EMP_PAYGRP_NAME=PHASE1","EMP_SIN=removed","EMP_FLAG3=N","EMP_FLAG11=0","EMP_FLAG14=R","EMP_VAL1=H2SA002","EMP_VAL2=","EMP_VAL3=1982","EMP_VAL4=.25","EMP_VAL5=SA","EMP_VAL6=CA","EMP_VAL7=4175755","EMP_VAL8=IF0HO1","EMP_VAL17=CA","EMP_VAL20=G00S16"


From this i want to extract 3 values EMP_FIRSTNAME=MANUEL, EMP_LASTNAME=BUENO and EMP_FULLNAME=BUENO, MANUEL. I am able to get the start position for each of these using index function not able to get the last position. Can any suggest me please?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Use the Field function and tell it you have a comma delimited source.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kirankota79
Premium Member
Premium Member
Posts: 315
Joined: Tue Oct 31, 2006 3:38 pm

Post by kirankota79 »

The position of these name fields in the string is not same for each record they keep on varying like the following sample 3 records.

"EMP_FIRSTNAME=LATHEEF SYED","EMP_LASTNAME=HUSSAINY","EMP_FULLNAME=HUSSAINY, LATHEEF SYED"

"EMP_FIRSTNAME=CHISTINE","EMP_FULLNAME=BRENNECKE, CHISTINE"

"EMP_BASE_RATE=8.81","EMP_BIRTH_DATE=08/03/1987 00:00","EMP_FIRSTNAME=MANUEL","EMP_LASTNAME=BUENO","EMP_FULLNAME=BUENO, MANUEL","EMP_PAYGRP_NAME=PHASE1","EMP_SIN=removed","EMP_FLAG3=N","EMP_FLAG11=0","EMP_FLAG14=R","EMP_VAL1=H2SA002","EMP_VAL2=","EMP_VAL3=1982","EMP_VAL4=.25","EMP_VAL5=SA","EMP_VAL6=CA","EMP_VAL7=4175755","EMP_VAL8=IF0HO1","EMP_VAL17=CA","EMP_VAL20=G00S16"


In that case , how i can use the filed function.

Thanks
nagarjuna
Premium Member
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

Hi ,
As of now i can think of a shell script using awk to get those fields and run that script from execute activity stage .

kirankota79 wrote:The position of these name fields in the string is not same for each record they keep on varying like the following sample 3 records.

"EMP_FIRSTNAME=LATHEEF SYED","EMP_LASTNAME=HUSSAINY","EMP_FULLNAME=HUSSAINY, LATHEEF SYED"

"EMP_FIRSTNAME=CHISTINE","EMP_FULLNAME=BRENNECKE, CHISTINE"

"EMP_BASE_RATE=8.81","EMP_BIRTH_DATE=08/03/1987 00:00","EMP_FIRSTNAME=MANUEL","EMP_LASTNAME=BUENO","EMP_FULLNAME=BUENO, MANUEL","EMP_PAYGRP_NAME=PHASE1","EMP_SIN=removed","EMP_FLAG3=N","EMP_FLAG11=0","EMP_FLAG14=R","EMP_VAL1=H2SA002","EMP_VAL2=","EMP_VAL3=1982","EMP_VAL4=.25","EMP_VAL5=SA","EMP_VAL6=CA","EMP_VAL7=4175755","EMP_VAL8=IF0HO1","EMP_VAL17=CA","EMP_VAL20=G00S16"


In that case , how i can use the filed function.

Thanks
Nag
kirankota79
Premium Member
Premium Member
Posts: 315
Joined: Tue Oct 31, 2006 3:38 pm

Post by kirankota79 »

Nagarjuna, Thanks for reply.

So there is no way that we can do this from DS functions?

And one more thing other than the position is, there is a comma with in the fullname part. so using comma as delimiter in field function will remove the second part in FULL NAME.
sjfearnside
Premium Member
Premium Member
Posts: 278
Joined: Wed Oct 03, 2007 8:45 am

Post by sjfearnside »

Since the comma in the full name is inside the quotes and would be part of that string, would it still be considered the same as a non-quoted comma, i.e. field delimiter? :?

"EMP_FULLNAME=HUSSAINY, LATHEEF SYED"
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

There are many ways to do it.

If you have located the position using Index, it must be easy to do a substring to extract the data.

Try it. If you hit any hurdles, give a shout with information of what you tried.
kirankota79
Premium Member
Premium Member
Posts: 315
Joined: Tue Oct 31, 2006 3:38 pm

Post by kirankota79 »

Hello Sai.....I already used the index for finding the start position for three names index(DSLink12.NAME,"EMP_FI",1), index(DSLink12.NAME,"EMP_LA",1), index(DSLink12.NAME,"EMP_FU",1). But the problem is with the end position. How can i get the index for the end position. Any help is great! Thanks
kirankota79
Premium Member
Premium Member
Posts: 315
Joined: Tue Oct 31, 2006 3:38 pm

Post by kirankota79 »

sjfearnside wrote:Since the comma in the full name is inside the quotes and would be part of that string, would it still be considered the same as a non-quoted comma, i.e. field delimiter? :?

"EMP_FULLNAME=HUSSAINY, LATHEEF SYED"
I already tried to use double code as field delimiter, but Datastage is not accepting it. Thanks
kirankota79
Premium Member
Premium Member
Posts: 315
Joined: Tue Oct 31, 2006 3:38 pm

Post by kirankota79 »

kirankota79 wrote:Hello Sai.....I already used the index for finding the start position for three names index(DSLink12.NAME,"EMP_FI",1), index(DSLink12.NAME,"EMP_LA",1), index(DSLink12.NAME,"EMP_FU",1). But the problem is with the end position. How can i get the index for the end position. Any help is great! Thanks
I am looking for a code that finds the index of the " (souble quote) after EMP_FI, EMP_LA and EMP_FU so that we get the end positions for 3.
Editor
Posts: 265
Joined: Wed Oct 30, 2002 11:19 am
Location: Dallas TX
Contact:

Post by Editor »

It has been brought to my attention of the use of real records with potentially real SS numbers. Please in the future refrain from actually posting real numbers.

THanks in advance.

Dennis James
Editor
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

kirankota79 wrote: I am looking for a code that finds the index of the " (souble quote) after EMP_FI, EMP_LA and EMP_FU so that we get the end positions for 3.
Take the substring created from your first index and do an index on this to look for the double quote
kirankota79
Premium Member
Premium Member
Posts: 315
Joined: Tue Oct 31, 2006 3:38 pm

Post by kirankota79 »

Kryt0n wrote:
kirankota79 wrote:
Take the substring created from your first index and do an index on this to look for the double quote
Can you please be more clear on this? i have first position using index(DSLink12.NAME,"EMP_FI",1), then the whole name is like

"EMP_FIRSTNAME=LATHEEF SYED" .


I didn't get, what u said like " Take the substring created from your first index" I am not getting any substring from first index. Its just a position.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

So make a substring out of it... play around with substrings and the index you have, see what you get... I'm sure things will become easier once you think about it
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So... maybe it's just me but I'm not really sure what exactly you want out of this. Are you literally trying to find and extract (as one example) this complete string?

EMP_FIRSTNAME=LATHEEF SYED

and that's it, we're done? Or do you have an EMP_FIRSTNAME field you are trying to populate, so first you need to pull that string out and then you need to parse out just the LATHEEF SYED piece as the derivation value for that field? :?

Regardless, as Kryt0n is trying to impress on you, think about what the next steps should be. You found where the target string starts, how do you identify the end of it? Think about searching from that found index position to get the position of the next double-quote and then substring out the data between those two positions. Seems like that should get you pretty close.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply