Modify Stage decimal_from_string

ymadden@bmi.com · Post by **ymadden@bmi.com** » Tue Jul 08, 2014 7:59 am

We have a DataStage job that reads multiple delimited-file formats using a sequential file stage and various schema files. We then use a modify stage to rename columns so that they line up with an interface table in our database.

We are having problems reading certain numeric fields from the delimited files when the numbers have been formatted with a comma as the thousands separator. We can read the numeric values as a string, but the result of the decimal_from_string conversion function comes out NULL if the value contains a comma. Is there any way to strip the commas in the modify stage? Or are there any options available for the decimal_from_string function?

IBM Analytics Champion 2009 - 2020 · Post by **asorrell** » Tue Jul 08, 2014 8:53 am

I don't have access to DataStage at the moment, but don't remember an option in the Modify stage to resolve this.

As a workaround, you could use a transformer Trim to strip the commas:

StringToDecimal(Trim(mylink.mycol,",","A"))

nirdesh2 · Post by **nirdesh2** » Tue Jul 08, 2014 8:55 am

You can use convert function in transformer to convert the values like 12,000 to 12000 and then use StringToDecimal to convert into decimal.

ymadden@bmi.com · Post by **ymadden@bmi.com** » Tue Jul 08, 2014 1:56 pm

I was hoping to avoid using a transformer. I was also hoping that since they allow the use of date format masks for date conversions in the modify spec, that they would also allow the use of number format masks.

The files that we are loading do not all contain the same columns. By using the schema files and by passing modify stage specifications as job parameters we are able to use RCP and have one single job to process various file formats. In order to operate on a column in a transformer, that column must be declared, making it required. This would reduce the reusability of this job. If there is no other option though, then we will just have to bite the bullet.

Is it possible to specify a number format on a schema file? I believe it is possible to specify a date format. I have not been able to find thorough documentation on schema file formats.

ssnegi · Post by **ssnegi** » Tue Jul 08, 2014 4:11 pm

you could strip the delimited file in the before-job routine by invoking shell script.
sed "s/,//g" file.txt

ymadden@bmi.com · Post by **ymadden@bmi.com** » Wed Jul 16, 2014 7:42 am

This is essentially what we ended up doing. The jobs were already set up to allow sed commands to be passed in for pre-processing. We used a more complicated regular expression though, to avoid destroying the comma delimiters in the csv files.

We also tried specifying a c_format string, taking from this post, but couldn't get it to work.

ray.wurlod · Post by **ray.wurlod** » Wed Jul 16, 2014 2:36 pm

You could use the string_trim() function in the Modify stage to remove the comma.

Why do you wish to avoid using Transformer stage?

ymadden@bmi.com · Post by **ymadden@bmi.com** » Mon Jul 21, 2014 1:16 pm

Hi Ray,
It seems the string_trim function only removes leading or trailing characters. It has no effect on the characters in the middle of the string.

As for avoiding a Transformer - The files that we are loading do not all contain the same columns. By using schema files and by passing modify stage specifications as job parameters we are able to use RCP and have one single job to process various file formats. In order to operate on a column in a transformer, that column must be declared, making it required. This would reduce the reusability of this job.

DSXchange

Modify Stage decimal_from_string

Modify Stage decimal_from_string

Re: Modify Stage decimal_from_string