Need to handle new line termination for a field

Rahul Bharadwaj
Premium Member
Posts: 24
Joined: Mon Jul 14, 2008 12:03 am
Location: Bangalore

Post by Rahul Bharadwaj »

Please let me know the parallel job equivalent of the 'Contains terminators' column property that is available in server jobs.

Thanks & Regards
Rahul
nagarjuna
Premium Member
Posts: 533
Joined: Fri Jun 27, 2008 9:11 pm
Location: Chicago

Post by nagarjuna »

Save the server metadata as a table definition in the DataStage repository. Open that table definition and you will find an option to sync it to parallel, which gives you the parallel job equivalent. Hope this helps.
Nag
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Could you explain for us the issue you are seeing and what things you've tried? Thanks.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Re: Need to handle new line termination for a field

Post by ray.wurlod »

Rahul Bharadwaj wrote: Please let me know the Equivalent property for 'contains terminators' (Column property in server) in parallel job.
There isn't one.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ray is being pedantic again. :wink:

Then how about just "the equivalent" rather than "the equivalent property"... as in how would one handle the situation in PX.
-craig

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The parallel Sequential File stage is notorious for not being able to handle embedded newlines. This is not pedantry - search back through the myriad posts complaining about it.

Note that simplistic "solutions" such as using a shell script to delete the new lines don't cut it for me - I believe the customer's data is sacrosanct - if these newline characters are part of the data, then (lacking advice to the contrary) they're part of the data and must be handled accordingly.

My workaround, as often as not, has been to put a server Sequential File stage into a server Shared Container and to embed that in the parallel job.
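
To make the underlying problem concrete, here is a minimal Python sketch. It is not DataStage code; the sample data and variable names are assumptions made up for illustration. It shows how a reader that treats every newline as a record boundary breaks a record in two, while a quote-aware reader keeps the embedded newline as part of the field.

```python
import csv
import io

# Hypothetical sample: the address in record 1 contains an embedded newline.
raw = 'id,address\n1,"12 High St\nFlat 3"\n2,"7 Low Rd"\n'

# Naive reading: every newline is taken as the end of a record,
# so the header plus two data records turn into four lines.
naive_records = raw.splitlines()

# Quote-aware reading: a newline inside a quoted field stays in the field,
# so the header plus two data records come back as three rows.
parsed_records = list(csv.reader(io.StringIO(raw)))

print(len(naive_records))
print(len(parsed_records))
print(repr(parsed_records[1][1]))
```

The second approach leaves the customer's data untouched, which is the property the workaround above is after.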
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I recall issues with embedded quotes but thought there was some sort of equivalent for this in the PX Sequential File stage, guess I thunk wrong.

For those of you who have faced and overcome this issue, what solution did you architect? What was your 'work around'?

Ah... I see Ray added his while I was typing. Thanks! :wink:
-craig

pandeesh
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

I have never come across this scenario.
Thanks Ray and Craig! Good learning! :)
pandeeswaran
Rahul Bharadwaj
Premium Member
Posts: 24
Joined: Mon Jul 14, 2008 12:03 am
Location: Bangalore

Thank you ray

Post by Rahul Bharadwaj »

Thanks Ray, I overcame this situation by using a server shared container in the parallel job. Thank you.
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

We have address processing and it's a big junkyard. The data had single and double quotes in it, so we had to use an extended ASCII character to quote all fields.
We never had newline characters. The front end guys need to understand that it makes no sense to capture carriage return and newline characters as part of any textual description.
- Zulfi
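
For anyone who wants to see the quoting idea in isolation, here is a rough Python sketch (not DataStage). It assumes the character 0xFE ('þ' in Latin-1) never occurs in the data; that choice, the sample row and the variable names are made up for illustration.

```python
import csv
import io

# Assumption: this extended ASCII character never appears in the real data.
QUOTE = "\xfe"

# Hypothetical row whose text contains both single and double quotes.
row = ["42", "O'Neil's \"Rose\" Cottage", "Bangalore"]

# Write with every field wrapped in the extended ASCII quote character.
buf = io.StringIO()
writer = csv.writer(buf, quotechar=QUOTE, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerow(row)
encoded = buf.getvalue()

# Read it back using the same quote character.
decoded = next(csv.reader(io.StringIO(encoded), quotechar=QUOTE))

print(encoded)
print(decoded == row)
```

Because the ordinary quote characters are no longer special, they pass through as plain data.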
cparru
Participant
Posts: 24
Joined: Wed Mar 07, 2007 11:22 am

broken address

Post by cparru »

Hi Guys,

I tried this solution to load address data that is broken across 2 lines. I was able to view the data in the shared container / Sequential File stage, but while loading it into the table (from a parallel job) it is still loading 2 records for one address. How do I remove that newline character (a square box :-)) from the output? Ereplace didn't work for me; it's not compiling!

Appreciate your suggestions!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Can you please post what the "resolution" is?

My point made earlier is that you should NOT remove the newline, at least not without the customer's approval - it's part of the data. So you have to find a mechanism for loading the data in which any character is legitimate. I suspect you've used a bulk loader, which has interpreted the end-of-line in its own way.
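
As a rough illustration of a load path in which any character is legitimate, the Python sketch below passes the value as a bound parameter instead of writing it to a line-oriented load file. The in-memory SQLite database, the table name and the sample address are assumptions for illustration only; they stand in for whatever target database is actually in use.

```python
import sqlite3

# Hypothetical address value with an embedded newline.
address = "12 High St\nFlat 3"

# Assumption: an in-memory SQLite table stands in for the real target table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, address TEXT)")

# The value travels as a bound parameter, so no line terminator is ever
# interpreted as a record boundary on the way in.
conn.execute("INSERT INTO customer (id, address) VALUES (?, ?)", (1, address))
conn.commit()

rows = conn.execute("SELECT id, address FROM customer").fetchall()
print(len(rows))
print(repr(rows[0][1]))
```

One row goes in and one row comes out, with the newline intact.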