DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
Author Message
arpitchopra
Participant



Joined: 01 Nov 2010
Posts: 35
Location: New Delhi
Points: 320

Post Posted: Fri Aug 31, 2012 5:26 am Reply with quote    Back to top    

DataStage® Release: 8x
Job Type: Parallel
OS: Unix
Hi Experts !

I have a parallel job that reads from a flat file and loads into a Netezza target. It has the following layout:

File --> Transformer --> Netezza Target

The transformer has a stage variable which checks whether a given column is Numeric or not. The derivation for this stage variable is as follows:

If TrimLeadingTrailing(Lk1.Quantity)='' or IsValid('decimal[18,3]',TrimLeadingTrailing(Lk1.Quantity))=1 Then '' Else 'Quantity is not a valid decimal'

I tried the following test cases:
Quantity: abc
Var: Quantity is not a valid decimal

Quantity: 123abc
Var: Quantity is not a valid decimal

Quantity: 0123 abc
Var: ''


This last test case is something that is troubling me. Even though the field is not a Numeric, it still passes the check. I have a feeling that this is because of the Trim function (which sees the space and discards everything after that space). Please advise.

_________________
Arpit Chopra
ShaneMuir



Group memberships:
Premium Members

Joined: 15 Jun 2004
Posts: 384
Location: London
Points: 3055

Post Posted: Fri Aug 31, 2012 5:47 am Reply with quote    Back to top    

arpitchopra wrote:
I have a feeling that this is because of the Trim function (which sees the space and discards everything after that space).


Did you try outputting the return value of the trim function to see what was actually output?
Rate this response:  
Not yet rated
arpitchopra
Participant



Joined: 01 Nov 2010
Posts: 35
Location: New Delhi
Points: 320

Post Posted: Fri Aug 31, 2012 6:07 am Reply with quote    Back to top    

I did. And surprisingly, the output for the 3rd case is 123.000 !
Is this how TrimLeadingTrailing usually works ? Isn't it just supposed to remove the white spaces from the beginning and the end of the entire string ?

ShaneMuir wrote:
Did you try outputting the return value of the trim function to see what was actually output?

_________________
Arpit Chopra
Rate this response:  
Not yet rated
ShaneMuir



Group memberships:
Premium Members

Joined: 15 Jun 2004
Posts: 384
Location: London
Points: 3055

Post Posted: Fri Aug 31, 2012 6:20 am Reply with quote    Back to top    

Its actually the IsValid test that is not working. If you were to pass your input un trimmed into the IsValid test you would probably find that it returns TRUE.

TrimleadingTrailing is used to remove space from the beginning and end of fields and leave multiple spaces in the middle of a string if they are there, unlike Trim which would remove the multiple spaces and leave just the one.

I wouldn't be surprised if the IsValid test only looks until the space and drops the rest of the line item.

As to why that would be the case - I do not know.
Rate this response:  
Not yet rated
arpitchopra
Participant



Joined: 01 Nov 2010
Posts: 35
Location: New Delhi
Points: 320

Post Posted: Fri Aug 31, 2012 6:35 am Reply with quote    Back to top    

ShaneMuir wrote:
Its actually the IsValid test that is not working. If you were to pass your input un trimmed into the IsValid test you would probably find that it returns TRUE.

I don't want to do that as my file may contain those junk spaces.
Consider a scenario where the input is " 12" (or "<space><space><space>12"). I want this to pass the test and be stored as 12.000
Won't IsValid without a Trim fail this case ?

ShaneMuir wrote:
I wouldn't be surprised if the IsValid test only looks until the space and drops the rest of the line item.
As to why that would be the case - I do not know.

You are right. I have applied TrimLeadingTrailing in the derivation for a few Varchar columns and it behaves as expected. It doesn't discard everything after the 1st space.




Do you have an alternative to test for this then ?

_________________
Arpit Chopra
Rate this response:  
Not yet rated
ShaneMuir



Group memberships:
Premium Members

Joined: 15 Jun 2004
Posts: 384
Location: London
Points: 3055

Post Posted: Fri Aug 31, 2012 6:35 am Reply with quote    Back to top    

I would suggest using the Num() or Alpha() functions to find those values which contain only numbers or alpha characters.
Rate this response:  
Not yet rated
arpitchopra
Participant



Joined: 01 Nov 2010
Posts: 35
Location: New Delhi
Points: 320

Post Posted: Fri Aug 31, 2012 6:44 am Reply with quote    Back to top    

ShaneMuir wrote:
I would suggest using the Num() or Alpha() functions to find those values which contain only numbers or alpha characters.

But this won't help me check for the precision. I need to discard everything that has precision greater than 3.

I mean we can ideally write a code that will check for the number of characters after "." but then it will be a huge effort to explain the multiple substrings and indices in the formula to someone else. I was hoping for something more subtle and simple. Like the IsValid. If only that would work. Sad

_________________
Arpit Chopra
Rate this response:  
Not yet rated
ShaneMuir



Group memberships:
Premium Members

Joined: 15 Jun 2004
Posts: 384
Location: London
Points: 3055

Post Posted: Fri Aug 31, 2012 6:53 am Reply with quote    Back to top    

Well you could try the NUM() function in conjunction with IsValid(). I am pretty sure NUM() will return true for 0123 and 0123.01 or 123.456, but false for 0123 abc.

Also I am pretty sure that the isvalid test won't actually test whether the input string has the relevant precision rather that it can be converted to the relevant precision. So 0123 and 0123.01 and 123.456 would all return true for the IsValid(Decimal) check.

If you want to check whether something has only 3 decimals from a string text I would use something like len(trim(input.string['.',2,1]))=3 or similar.
Rate this response:  
Not yet rated
arpitchopra
Participant



Joined: 01 Nov 2010
Posts: 35
Location: New Delhi
Points: 320

Post Posted: Fri Aug 31, 2012 7:01 am Reply with quote    Back to top    

ShaneMuir wrote:
Also I am pretty sure that the isvalid test won't actually test whether the input string has the relevant precision rather that it can be converted to the relevant precision. So 0123 and 0123.01 and 123.456 would all return true for the IsValid(Decimal) check

You are right. This was a mistake on my part.

However, when I write the following code as a derivation to the same Quantity column, I get 1234.000 as the output.

Code:
StringToDecimal(TrimLeadingTrailing(Lk1.Quantityt),"round_inf")

I don't think it is an issue with IsValid.

_________________
Arpit Chopra
Rate this response:  
Not yet rated
ShaneMuir



Group memberships:
Premium Members

Joined: 15 Jun 2004
Posts: 384
Location: London
Points: 3055

Post Posted: Fri Aug 31, 2012 7:14 am Reply with quote    Back to top    

I suggested that IsValid had an issue, mainly because of your original post which suggested that it was the TrimLeadingTrailing giving the incorrect result, when actually your incorrect result was because the IsValid determined that the string '0123 abc' was a valid decimal and thus created an output you were not expecting.

Also, If you where to pass the string '0123 abc' to a column that is set as decimal with a specific precision, with no functions, conversions etc, i am guessing that it would convert the first part and drop the second part. Ie your output would be eg 123.000. As to my understanding this is because the IsValid test effectively tries to do a StringtoDecimal conversion, if it can, then its valid, if not then invalid. Passing the string to a decimal output would have the same result but you would get the errors written to the log.
Rate this response:  
Not yet rated
arpitchopra
Participant



Joined: 01 Nov 2010
Posts: 35
Location: New Delhi
Points: 320

Post Posted: Wed Sep 12, 2012 3:56 am Reply with quote    Back to top    

It is actually an issue with 2 things.
1. The first is that StringToDecimal() will automatically convert any string to a decimal (regardless of the format specified) if it CAN be converted. So, "1234 abc" will be converted to 1234
2. The second issue (and a bigger one) is with IsValid(). This function will also work in the same way. It will return true if the string CAN be converted to a decimal. It expects you to handle that conversion in the derivation of the target column.

Please read the following link for further information:
http://www-01.ibm.com/support/docview.wss?uid=swg1JR39651

_________________
Arpit Chopra
Rate this response:  
Not yet rated
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours