Modify Stage decimal_from_string

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ymadden@bmi.com
Premium Member
Premium Member
Posts: 21
Joined: Fri Feb 01, 2013 9:13 am

Modify Stage decimal_from_string

Post by ymadden@bmi.com »

We have a DataStage job that reads multiple delimited-file formats using a sequential file stage and various schema files. We then use a modify stage to rename columns so that they line up with an interface table in our database.

We are having problems reading certain numeric fields from the delimited files when the numbers have been formatted with a comma as the thousands separator. We can read the numeric values as a string, but the result of the decimal_from_string conversion function comes out NULL if the value contains a comma. Is there any way to strip the commas in the modify stage? Or are there any options available for the decimal_from_string function?
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I don't have access to DataStage at the moment, but don't remember an option in the Modify stage to resolve this.

As a workaround, you could use a transformer Trim to strip the commas:

StringToDecimal(Trim(mylink.mycol,",","A"))
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
nirdesh2
Participant
Posts: 56
Joined: Thu Nov 20, 2008 12:18 pm
Location: Noida

Re: Modify Stage decimal_from_string

Post by nirdesh2 »

You can use convert function in transformer to convert the values like 12,000 to 12000 and then use StringToDecimal to convert into decimal.
Nirdesh Kumar
ymadden@bmi.com
Premium Member
Premium Member
Posts: 21
Joined: Fri Feb 01, 2013 9:13 am

Post by ymadden@bmi.com »

I was hoping to avoid using a transformer. I was also hoping that since they allow the use of date format masks for date conversions in the modify spec, that they would also allow the use of number format masks.

The files that we are loading do not all contain the same columns. By using the schema files and by passing modify stage specifications as job parameters we are able to use RCP and have one single job to process various file formats. In order to operate on a column in a transformer, that column must be declared, making it required. This would reduce the reusability of this job. If there is no other option though, then we will just have to bite the bullet.

Is it possible to specify a number format on a schema file? I believe it is possible to specify a date format. I have not been able to find thorough documentation on schema file formats.
ssnegi
Participant
Posts: 138
Joined: Thu Nov 15, 2007 4:17 am
Location: Sydney, Australia

Post by ssnegi »

you could strip the delimited file in the before-job routine by invoking shell script.
sed "s/,//g" file.txt
ymadden@bmi.com
Premium Member
Premium Member
Posts: 21
Joined: Fri Feb 01, 2013 9:13 am

Post by ymadden@bmi.com »

This is essentially what we ended up doing. The jobs were already set up to allow sed commands to be passed in for pre-processing. We used a more complicated regular expression though, to avoid destroying the comma delimiters in the csv files.

We also tried specifying a c_format string, taking from this post, but couldn't get it to work.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You could use the string_trim() function in the Modify stage to remove the comma.

Why do you wish to avoid using Transformer stage?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ymadden@bmi.com
Premium Member
Premium Member
Posts: 21
Joined: Fri Feb 01, 2013 9:13 am

Post by ymadden@bmi.com »

Hi Ray,
It seems the string_trim function only removes leading or trailing characters. It has no effect on the characters in the middle of the string.

As for avoiding a Transformer - The files that we are loading do not all contain the same columns. By using schema files and by passing modify stage specifications as job parameters we are able to use RCP and have one single job to process various file formats. In order to operate on a column in a transformer, that column must be declared, making it required. This would reduce the reusability of this job.
Post Reply