Breaking Up Mixed Data and Classifying
Posted: Wed Sep 30, 2009 1:31 pm
by emeri1md
I'm taking mixed data, such as 3KG, and breaking them up. It's easy as long as I know for certain that the trailing characters will always be the measurement unit. However, I also get other data in the same format. This is giving me bad data in my output.
I need to take a pattern like this:
* >
and make sure that the characters belong to the measurement unit classification. I know I can use CONVERT in many ways, but none that I can think of seems likely to work. If I could get the number removed from the >, then I could easily use CONVERT, but nothing (according to the reference) seems to remove a portion of the token; the actions only copy it.
Is there a way to check a user variable for the classification of the word? I'm fairly certain there isn't.
Thanks.
Posted: Wed Sep 30, 2009 2:13 pm
by JRodriguez
Check the pattern actions below, which look up the trailing alphabetic characters against a lookup table using CONVERT:
copy [1](-c) Temp1 ; Extracts all trailing alphabetic characters from operand 1
copy [1](n) Temp2 ; Extracts all leading numeric characters from operand 1
CONVERT Temp1 @CHECKMESUNIT.TBL TEMP ; If Temp1 is in the CHECKMESUNIT table, it is converted to the string "MesUnit"
[Temp1 = "MesUnit"]
COPY Temp1 {MesUnit}
Posted: Mon Oct 05, 2009 7:53 am
by emeri1md
That would work, except I also need to use the abbreviated form of the word. That said, and thinking about your solution, a messy fix would be to have two tables: one to check whether the characters are a unit of measure, and a second one to actually change the value to the abbreviated version.
Do you see any alternatives?
Thanks,
Matt
Posted: Mon Oct 05, 2009 5:56 pm
by stuartjvnorton
Hi Matt,
Take a look at CONVERT_S (p31 in the PAL reference).
Sounds like it might be what you want.
< ; whatever pattern
CONVERT_S [1] @units.tbl TKN U ^ ;where U is some type for units
^ | U ;the CONVERT_S permanently split it into 2 tokens.
COPY [1] {Amount}
COPY [2] {Unit}
RETYPE [1] 0
RETYPE [2] 0
The TBL has the normal and standardized versions of the unit and an optional Comparison Threshold, like a CLS file without the Field Type column.
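For readers who want to see the intended logic outside PAL, here is a rough Python sketch of the idea as described in this thread: split a leading-numeric token only when its trailing letters are found in a units table, and return the abbreviated form from that same table. This is not PAL and not how QualityStage implements CONVERT_S. The `UNITS` dictionary is a hypothetical stand-in for units.tbl, and the function name `split_if_unit` is made up for illustration.

```python
import re

# Hypothetical stand-in for units.tbl: raw unit -> standardized abbreviation.
UNITS = {
    "KG": "KG",
    "KILOGRAM": "KG",
    "KILOGRAMS": "KG",
    "LB": "LB",
    "LBS": "LB",
    "POUNDS": "LB",
}

# Digits followed by letters, e.g. "3KG" (the kind of token the > class matches).
LEADING_NUMERIC = re.compile(r"^(\d+)([A-Za-z]+)$")


def split_if_unit(token, units=UNITS):
    """Return (amount, unit) only if the trailing letters are a known unit.

    Returns None when the token has the wrong shape or the suffix is not
    in the table, so the caller can leave the token untouched.
    """
    match = LEADING_NUMERIC.match(token)
    if match is None:
        return None
    amount, suffix = match.groups()
    standard = units.get(suffix.upper())
    if standard is None:
        return None  # same shape as a measurement, but not a unit
    return amount, standard
```

With this table, `split_if_unit("3KG")` gives `("3", "KG")`, while a token such as `"4TH"` comes back as `None` and can be handled by other patterns. One table covers both the membership check and the abbreviation, which avoids the two-table workaround.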
Posted: Tue Oct 06, 2009 6:25 am
by emeri1md
Thanks for bringing that up, Stuart, but the patterns < or > do not always match up to measurement. It can be another value as well. I need to be able to check and see if it is a unit before committing to it. The same for other types of < and >.
Posted: Tue Oct 06, 2009 3:21 pm
by ray.wurlod
That's why Stuart recommended a lookup against units.tbl as part of his solution.
Posted: Wed Oct 07, 2009 6:24 am
by emeri1md
So you're saying that it will only change the type if it finds it in the table... Sorry, I must have missed that. Thanks for the help. I'll try it out and then mark this as resolved.