Field value to the length of the column

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
harikhk
Participant
Posts: 64
Joined: Tue Jun 04, 2013 11:36 am

Field value to the length of the column

Post by harikhk »

Hi,

I have a query while creating a fixed width file

1. If there is an integer field(or even a char type) of length 6 how do I get 0s(or another padding character) appended to the beginning
Ex:
Input : 1234
Expected Output : 001234(Length of the target column is 6 )

I have tried the options
a) Str(0,6-Len(column)):column

b) Right('000000':column,6)
In this option I understand that if the length of the column is more for example 30, it would be required to use 30 0s to
concatenate where human errors can creep in

Both the option suffice my expected result but there are many such columns and required to hard code. I would like to know if it is possible without hard coding.

If at all it is required to hard code, whcih of the above options would be suggested in terms of performance.
My input data volume varies from 1000s of records to millions (max 5 million) of records

Please suggest which of the above options is better in terms of constrcution and also performance
Thanks,
HK
*Go GREEN..Save Earth*
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

B is more efficient. This should only be an issue with string fields, for numeric values the 'padding' with zeroes should be happening automatically. Are you not seeing that?
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I would also vote for option B as it appears to do less work in order to produce the same result.

Decimals data types auto-pad with zeros; haven't seen that with integers.
Choose a job you love, and you will never have to work a day in your life. - Confucius
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Was wondering if being an Integer made a difference. Could you make it a decimal with a scale of zero? But then you'll also get a space reserved for the sign, from what I recall, maybe even one of them pesky decimal points.
-craig

"You can never have too many knives" -- Logan Nine Fingers
harikhk
Participant
Posts: 64
Joined: Tue Jun 04, 2013 11:36 am

Post by harikhk »

Hi Craig,
Decimal datatypes auto pad with zeros while integers do not.
Yes, if defined as Decimal/Numeric, it does have the decimal point which is not preferred along with a space for the sign

And in case of strings, padding by default is to the right as defined by the $APT_STRING_PADCHAR
I am not aware of if there is an option to define left padding

Hi Eric,
In terms of construction effort I agree that option B is easier but what in terms of performance
Thanks,
HK
*Go GREEN..Save Earth*
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The performance of B has already been answered, did you click on the link in my earlier reply?
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

harikhk wrote:In terms of construction effort I agree that option B is easier but what in terms of performance
What I said above was option B does less work (meaning better performance). You said it was easier, and I would agree with that too. :)
Choose a job you love, and you will never have to work a day in your life. - Confucius
harikhk
Participant
Posts: 64
Joined: Tue Jun 04, 2013 11:36 am

Post by harikhk »

Craig,
Overlooked the link. Thanks for sharing the part.

For performance noted option B better

Still intersted to know if there is another option to avoid hard coding or is only the option considering below scenario
Just in case the target field is a 200 character length and has 50 input characters, is it that the only option as we have to mention 200 zeros(or a padding character for string fields) and then concatenate and apply the right function (as left padding is what I am interseted in)?

And one more query which is a bit connected(hope so), how do write a null field value to the length of the field
Ex: For a field of length 6
Input : NULL
Output : ' ' (the quotes here are only for display and the actual output is 6 spaces)

There could be multiple such null value fields.
Should I go with option b after checking if input value is null .
Last edited by harikhk on Thu May 22, 2014 2:29 pm, edited 1 time in total.
Thanks,
HK
*Go GREEN..Save Earth*
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

An actual NULL field would be just two quotes with nothing between them. If you want a string of equivalent size filled with spaces for your fixed-width file, there is a Spaces() function from what I recall so Spaces(6) would work for your example. Or a CHAR field with the PAD_CHAR set to 0x20 should work, too.
-craig

"You can never have too many knives" -- Logan Nine Fingers
harikhk
Participant
Posts: 64
Joined: Tue Jun 04, 2013 11:36 am

Post by harikhk »

Thanks for the space() function and also I think option concatenation with a null value may not work.
PAD_CHAR helps for right padding. I am looking for a left padding option
Thanks,
HK
*Go GREEN..Save Earth*
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

harikhk wrote:Just in case the target field is a 200 character length and has 50 input characters, is it that the only option as we have to mention 200 zeros(or a padding character for string fields) and then concatenate and apply the right function (as left padding is what I am interseted in)?
Instead of hard-coding 200 zeros, you could go for something in-between options A and B. I would not expect any of the options to "kill" the performance. You could also move the Str() function into a stage variable and reuse it across the variety of derivations. The tradeoff here would be reducing potential for human errors for a bit of performance (may be difficult to measure).

Option C

Code: Select all

Right(Str('0', 200):column, 200)
And you can always search for option D.
Choose a job you love, and you will never have to work a day in your life. - Confucius
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

harikhk wrote:PAD_CHAR helps for right padding. I am looking for a left padding option
I was specifically responding to a scenario where the field was only populated with pad characters - your NULL string. In that case left vs. right makes no difference.
-craig

"You can never have too many knives" -- Logan Nine Fingers
harikhk
Participant
Posts: 64
Joined: Tue Jun 04, 2013 11:36 am

Post by harikhk »

Thanks Craig and Eric.
Marking the topic resolved
Thanks,
HK
*Go GREEN..Save Earth*
ssnegi
Participant
Posts: 138
Joined: Thu Nov 15, 2007 4:17 am
Location: Sydney, Australia

Post by ssnegi »

Right(Str('0',6):column,6)
Post Reply