Breaking Up Mixed Data and Classifying

Posted: Wed Sep 30, 2009 1:31 pm
by emeri1md
I'm taking mixed tokens, such as 3KG, and breaking them up. That's easy as long as I know for certain that the trailing characters will always be the measurement unit. However, I also get other data in the same format, and that is putting bad data in my output.

I need to take a pattern like this:

* >

and make sure that the characters belong to the measurement unit classification. I know I can use CONVERT in many ways, but none that I can think of seems likely to work. If I could get the number removed from the >, then I could easily use CONVERT, but nothing (according to the reference) seems to remove a portion of the token; the actions only copy it.

Is there a way to check a user variable for the classification of the word? I'm fairly certain there isn't.

Thanks.

Posted: Wed Sep 30, 2009 2:13 pm
by JRodriguez
The pattern actions below look up the trailing alphabetic characters against a lookup table using CONVERT:

copy [1](-c) Temp1 ; Extracts all trailing alphabetic characters from operand 1
copy [1](n) Temp2 ; Extracts all leading numeric characters from operand 1

CONVERT Temp1 @CHECKMESUNIT.TBL TEMP ; If Temp1 is in the CHECKMESUNIT table, it is replaced with the "MesUnit" string

[Temp1 = "MesUnit"]
COPY Temp1 {MesUnit}

Posted: Mon Oct 05, 2009 7:53 am
by emeri1md
That would work, except I also need to use the abbreviated form of the word. That said, and thinking about your solution, a messy fix would be to have two tables: one to check whether the characters are a unit of measure, and a second one to actually change the value to the abbreviated version.

Do you see any alternatives?

Thanks,

Matt

Posted: Mon Oct 05, 2009 5:56 pm
by stuartjvnorton
Hi Matt,

Take a look at CONVERT_S (p31 in the PAL reference).
Sounds like it might be what you want.

< ; whatever pattern
CONVERT_S [1] @units.tbl TKN U ^ ;where U is some type for units

^ | U ;the CONVERT_S permanently split it into 2 tokens.
COPY [1] {Amount}
COPY [2] {Unit}
RETYPE [1] 0
RETYPE [2] 0

The TBL has the unit as it appears in the data, its standardized version, and an optional Comparison Threshold, like a CLS file without the Field Type column.

Posted: Tue Oct 06, 2009 6:25 am
by emeri1md
Thanks for bringing that up, Stuart, but the patterns < or > do not always match up to measurement. It can be another value as well. I need to be able to check and see if it is a unit before committing to it. The same for other types of < and >.

Posted: Tue Oct 06, 2009 3:21 pm
by ray.wurlod
That's why Stuart recommended a lookup against units.tbl as part of his solution.

Posted: Wed Oct 07, 2009 6:24 am
by emeri1md
So you're saying that it will only change the type if it finds it in the table... Sorry, I must have missed that. Thanks for the help. I'll try it out and then mark this as resolved.
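
For anyone following along, the behaviour described in this thread (split the token only when the trailing characters are found in the units table, otherwise leave it alone) can be sketched outside PAL. This is a rough Python illustration of the logic only, not how CONVERT_S is implemented; the `UNITS` dictionary is a hypothetical stand-in for units.tbl, and `split_if_unit` is an invented helper name.

```python
import re

# Hypothetical stand-in for units.tbl: input form -> standardized abbreviation.
UNITS = {"KG": "KG", "KILO": "KG", "KILOS": "KG", "LB": "LB", "LBS": "LB", "ML": "ML"}

# Leading digits followed by trailing letters, like the 3KG example in the thread.
_MIXED = re.compile(r"(\d+)([A-Za-z]+)")


def split_if_unit(token, units=UNITS):
    """Return (amount, standardized_unit) if the suffix is a known unit, else None."""
    match = _MIXED.fullmatch(token)
    if match is None:
        return None  # not a leading-numeric mixed token at all
    amount, suffix = match.groups()
    unit = units.get(suffix.upper())
    if unit is None:
        return None  # suffix is not in the table, so the token is left untouched
    return amount, unit


if __name__ == "__main__":
    for tok in ("3KG", "12LBS", "4B"):
        print(tok, "->", split_if_unit(tok))
```

A single table covers both the membership check and the abbreviation, which is the same reason the one-table CONVERT_S approach avoids the two-table workaround mentioned earlier.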