Bug in NUM() Function ?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
skathaitrooney
Participant
Posts: 103
Joined: Tue Jan 06, 2015 4:30 am

Bug in NUM() Function ?

Post by skathaitrooney »

Hi Experts,

I was using NUM() function in a transformer wherein i faced a peculiar issue.

Below is the definition of NUM function per IBM:
Num (expression)
expression is the expression to test. If expression is a number, a numeric string, or an empty string, a value of 1 is returned. If it is a null value, null is returned; otherwise 0 is returned.

Example: NUM(35E1)

After applying Num function it is should return 0, as field holds both numeric and alphanumeric characters. but it is returning 1

According to me due to the 'E' in the expresion, it is considering the number as an exponential form.

It would work as expected if i test it using NUM(35AE1), NUM(35C1) etc.

Any workaround anyone can suggest to overcome this ?
ds_developer
Premium Member
Premium Member
Posts: 224
Joined: Tue Sep 24, 2002 7:32 am
Location: Denver, CO USA

Post by ds_developer »

I'd suggest adding a test for the 'E' such as:

Num(nodupes.BOUNDARY_AREA) AND Index(nodupes.BOUNDARY_AREA,'E',1)

Hope this helps,
John
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Here is the reference text quoted from the Parallel Job Developer's Guide:
Num Returns 1 if string can be converted to a number, or 0 otherwise.
- Input: string (string)
- Output: result (int32)
Scientific notation strings can be converted to numbers, so perhaps the function is actually giving correct results.
Choose a job you love, and you will never have to work a day in your life. - Confucius
skathaitrooney
Participant
Posts: 103
Joined: Tue Jan 06, 2015 4:30 am

Post by skathaitrooney »

Yes indeed.. But is their any workaround? I mean in my case this is a complete string and not a number in scientific notation ?
I have tried multiple workarounds to fail.
Any suggestions are welcomed.
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

As John suggested above, you could supplement the Num() function as a workaround to test for presence of an "E" within your string, then take action accordingly. That assumes that "E" is the only exception that you see getting through, in your case.

Another alternative is to search DSXchange on using "double convert" function method, and compare before and after values. That's an interesting and elegant method, although it's got to be overkill as far as CPU cycles go.
Choose a job you love, and you will never have to work a day in your life. - Confucius
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The "double convert" is useful when you want to remove all "bad" characters from a string without specifying them, rather you list the "good" ones to keep. I'm not sure that would be all that helpful here since (as you noted) the Index will work just fine to detect the 'E' in the string. I'll tweak John's code slightly and say that it is actually a number if:

Code: Select all

Num(nodupes.BOUNDARY_AREA) AND Index(nodupes.BOUNDARY_AREA,'E',1) = 0

-or-

Num(nodupes.BOUNDARY_AREA) AND Not(Index(nodupes.BOUNDARY_AREA,'E',1))
-craig

"You can never have too many knives" -- Logan Nine Fingers
UCDI
Premium Member
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

lots of ways to do it... if it is just the E character, num(ereplace(input)) type statement might do the trick. This might be one of the more efficient ways, at the cost of only working for the one situation. If you have to start stringing junk together like num(ereplace(upcase(... input) at some point you have iterated over the string 20 times just to do what can be done in a single iteration*


*I can't think of a single iteration in datastage using the built in but weak string processing functions. That would require a dedicated function / routine type effort, but it is doable. Datastage's string processing leaves much to be desired, to be honest.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

35E1 is a numeric representation ("scientific" notation). It represents

Code: Select all

 35 * 10 ** 1
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: Bug in NUM() Function ?

Post by chulett »

We knows. From the original post:
skathaitrooney wrote:According to me due to the 'E' in the expresion, it is considering the number as an exponential form.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

I'll toss out another option since no one has mentioned it yet (there are always options):

Code: Select all

IsValid("decimal[4,0]", "35E1")
This will return 0 since the string literal 35E1 is not a valid decimal(4,0).

Code: Select all

IsValid("decimal[4,0]", 35E1)
This will return 1 since the numeric literal 35E1 is a valid decimal(4,0).

Code: Select all

IsValid("decimal[4,0]", 35E3)
This will return 0 since the numeric literal 35E3 is not a valid decimal(4,0).

Mike
Post Reply