varchar numeric to Oracle decimal conversion

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

mczern
Premium Member
Posts: 80
Joined: Thu Jan 20, 2005 8:38 am
Location: Atlanta, GA

varchar numeric to Oracle decimal conversion

Post by mczern »

I'm converting a source Oracle Code/Value table, where the source Value column is defined as Varchar2(256), to an Oracle Decimal(32,10). The values I need to convert are quite diverse (e.g. '432,000,200', '1.45e6', '567234.75', '324567.4', '123456.789').

My solution:
1. Read the values from Oracle in as VarChar.
2. Use a Switch stage to route on each Code; in this case, 'Amount'.
3. In a Transformer, define the value column 'AMOUNT' as Float(32,3) to convert the scientific-notation values to the proper format.
4. In another Transformer, convert the float values to Decimal(32,10).

The issue I'm running into is that Step 3 introduces artificial digits: the input value 324567.4 becomes 324567.401, 567234.75 becomes 567234.749, while most other values are unaffected and convert without issue.

I have played around with the precision in Step 3 above. I originally had the precision set to 10, but that gave the same results, just to 10 decimal places. I was hoping to keep the precision at 10 since for some field values I don't know in advance what the precision is going to be. Sometimes it's 2 and others it's 6 for percentage-type fields.
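A quick illustration of what is going on here (Python outside DataStage, using a value from the post): 324567.4 has no exact 32-bit binary representation, so a Float(32,3) column holds the nearest representable value, and the later decimal conversion exposes the leftover digits.

```python
import struct

# Round-trip 324567.4 through a 32-bit float, as a Float column
# effectively does; the nearest representable float32 differs.
raw = 324567.4
as_float32 = struct.unpack('f', struct.pack('f', raw))[0]

print(as_float32)        # nearest float32 value, not exactly 324567.4
print(as_float32 - raw)  # the "artificial" remainder the decimal sees
```

This is also why widening the scale to 10 only reproduced the same artifacts to 10 places: the error comes from the binary float step itself, not from the declared precision.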
Mike Czerniawski
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

This is the kind of task that a server job can do brilliantly. Is there any reason you can't use a server job?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by mczern »

We were trying to move away from server jobs, and we're trying to move away from hash files. I was taking advantage of the Switch stage for all of the code/value pairs, as well as the additional stages available in the Parallel Edition.

Is this a known issue in the Parallel Edition?
Mike Czerniawski
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Move away from Server jobs where appropriate; leverage Server jobs where appropriate. Basically, Ray is wondering if there is anything you are doing in this job that requires it to be a Parallel job. Your 'moving awaying' notwithstanding, of course.
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by mczern »

There is a Switch for the 20+ different field value types; a Merge stage then brings all the values together into columns, and an SCD stage takes care of the dimension changes. Converting the Switch to a Transformer with 20+ outputs seems counterproductive. Using the SCD stage is pretty convenient as well.

Would this solution seem to work (thinking off the top of my head and looking at some previous posts)?

1. Define the target column as Decimal(38,10).
2. Check for commas and remove them (converts correctly to decimal).
3. Check for 'E' and split the field into x * 10^y (does this work for negative exponents?).
4. All other values convert correctly with the proper decimal precision.

Is there anything I missed?
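One way to sanity-check the proposed steps outside DataStage: Python's Decimal type parses scientific notation directly (negative exponents included) once the grouping commas are stripped, with no binary-float step in between. The function name is mine; this is a sketch, not DataStage code:

```python
from decimal import Decimal

def clean_to_decimal(raw):
    """Strip grouping commas, then parse exactly with Decimal.
    Handles plain numbers and scientific notation, either sign."""
    if raw is None or raw.strip() == '':
        return None
    return Decimal(raw.replace(',', ''))

# Values from the original post, plus a negative exponent
for v in ['432,000,200', '1.45e6', '567234.75', '324567.4', '-1.2e-3']:
    print(v, '->', clean_to_decimal(v))
```

Because Decimal stores the digits exactly, 324567.4 stays 324567.4 instead of picking up artificial trailing digits.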
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

If your source is Oracle, you could convert it to a number prior to output.

eg use the following in your select clause for the given column

Code:

 to_number(replace('INPUT',',','')) 
where INPUT is your varchar input value.

That syntax should work for any of the values that you provided.

Post by chulett »

Including those in scientific notation? Hmmm...

By golly, it does! :D

Post by mczern »

The only issue with doing it in SQL is that I have 20+ fields in the code/value set, with 8 or so being numeric.

Post by chulett »

Sorry, but... how exactly is that an issue?

Post by mczern »

I'm not sure. Maybe it is not. I'm currently doing a

Select key, code, value
from c_v_table

Code is varchar(256), value is varchar(256). Then I have pairs like
code:'name', value:'Joe Smith';
code:'income', value:'290000';
code:'interest-rate', value:'3.875';
...

I used a Switch to parse the values and then convert after a Merge creates a single record per key.

Not sure how to convert in SQL first. Use CASE WHEN?
rkashyap
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

For small data volumes, Server jobs compile and perform MUCH better than Parallel jobs. Further, Server jobs offer more flexibility in string manipulation.

For the Switch stage, see this excerpt from section 9.3 of the Redbook InfoSphere DataStage: Parallel Framework Standard Practices:
9.3 Filter and Switch stages
The Filter and Switch stages evaluate their expressions at runtime for every input row. Because it is compiled, a parallel Transformer with output link constraints is faster than a Filter or Switch.

Use of Filter and Switch stages must be limited to instances where the entire filter or switch expression must be parameterized at runtime. In a Parallel Transformer, link constraint expressions, but not data, is fixed by the developer.

Post by ShaneMuir »

mczern wrote: Not sure how to convert in SQL first. Use CASE WHEN?

You would just put the syntax around the field you wish to convert, so your select statement becomes (add the column alias afterwards, otherwise DataStage tends to get confused):

Code:

Select key, 
code, 
to_number(replace(value,',','')) value
from c_v_table
You then shouldn't need to perform any conversions, as the value will be output as a number. I think the default output type will be NUMBER with no bound limit (or it could be DECIMAL, anyone?), so you might get some warnings, and you might have to do some sort of numeric-to-decimal conversion in your job to remove them.

Solution

Post by mczern »

For those interested, here is how I solved the problem:
1. Defined the target column as Decimal(38,10).
2. For each column, I used the following logic:

Code:

If IsNull(lnk_mrg.THE_VALUE) Then SetNull()
Else If Num(lnk_mrg.THE_VALUE) And Index(lnk_mrg.THE_VALUE, 'E', 1) > 0
   Then Field(lnk_mrg.THE_VALUE, 'E', 1) * (10 ^ Field(lnk_mrg.THE_VALUE, 'E', 2))
Else If Num(Convert(',', '', lnk_mrg.THE_VALUE))
   Then Convert(',', '', lnk_mrg.THE_VALUE)
Else If Num(lnk_mrg.THE_VALUE) Then lnk_mrg.THE_VALUE
Else SetNull()
Works for both positive and negative exponents.
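For readers who want to trace the branch order, the same logic mirrored in Python (a sketch under my own naming, not DataStage code; Field/Convert become partition/replace, and Num becomes a try/except test):

```python
def convert_value(value):
    """Mirror of the transformer expression: null in -> null out;
    scientific notation -> mantissa * 10^exponent; comma-grouped ->
    commas stripped; plain numeric -> unchanged; anything else -> null."""
    if value is None:
        return None
    if _is_num(value) and 'E' in value.upper():
        mantissa, _, exponent = value.upper().partition('E')
        return float(mantissa) * 10 ** int(exponent)
    stripped = value.replace(',', '')
    if _is_num(stripped):
        return float(stripped)
    return None

def _is_num(s):
    """Rough stand-in for DataStage's Num(): is the string numeric?"""
    try:
        float(s)
        return True
    except ValueError:
        return False
```

Note the branch order matters: a value that is numeric and contains an exponent takes the mantissa-times-power path first, exactly as in the expression above.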