Partial read of sequential file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

takyn
Participant
Posts: 6
Joined: Thu May 25, 2006 11:08 pm
Location: Australia


Post by takyn »

I have a requirement to read a number of files that match the following format:

TYPE, DESCRIPTION, col1, col2, col3, ... coln

note: each record in the sequential file is terminated by a Unix newline.


I am trying to create a job that reads all the files at once (using a file pattern) but only reads in the first two columns, as the remaining columns vary between files.


If I try to read an individual file using a DS Server job, this works a treat - however I don't have the option of picking up files based on a file mask.

In DS Parallel, I can't seem to read in the files (whether referenced individually, or using a mask). The output returns 0 rows.

any thoughts?

-David
------------------------------------
-Dave
takyn
Participant
Posts: 6
Joined: Thu May 25, 2006 11:08 pm
Location: Australia

Post by takyn »

Played around with settings.

Used:
Final delimiter = none (or just delete this option)
Record Delimiter = UNIX newline
Delimiter = comma
Quote = double

Also set APT_IMPORT_PATTERN_USES_FILESET=TRUE to get the filenames out.
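
For readers who want to see what these format settings mean outside DataStage, here is a rough Python analogy, not a DataStage configuration. It assumes comma-delimited, double-quoted, newline-terminated records as listed above. The function name `read_matching_files` and the file pattern passed to it are hypothetical, and returning the file name next to each record only loosely mirrors the goal of getting the filenames out.

```python
import csv
import glob
import os


def read_matching_files(pattern):
    """Yield (file_name, fields) for each record in each file matching pattern."""
    for path in sorted(glob.glob(pattern)):
        # newline="" lets the csv module handle the record terminators itself.
        with open(path, newline="") as handle:
            # Delimiter = comma, Quote = double, one record per line.
            reader = csv.reader(handle, delimiter=",", quotechar='"')
            for fields in reader:
                yield os.path.basename(path), fields
```

Each file may yield a different number of fields per record, which is fine here because nothing downstream assumes a fixed column count.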

-Dave
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You must read every byte in a sequential file to get to the next byte. Therefore you must read every field. But, if you look carefully, you will find that each column has a "Drop On Import" property - you could set this property for all except the two fields you want to read.
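
The same idea can be sketched in plain Python as an illustration only; it is not how DataStage implements the property. Every field of every record is parsed, and everything after the first two fields is then discarded. The function name `keep_type_and_description` and the sample data are made up for this sketch.

```python
import csv
import io


def keep_type_and_description(text):
    """Parse every field of each record, then keep only the first two."""
    kept = []
    for fields in csv.reader(io.StringIO(text, newline="")):
        if len(fields) < 2:
            continue  # skip blank or too-short records
        # The whole record has been read; the trailing fields are dropped here.
        record_type, description = fields[0], fields[1]
        kept.append((record_type, description))
    return kept


# Records with different numbers of trailing columns.
sample = 'T1,"desc, one",1,2,3\nT2,desc two,9\n'
print(keep_type_and_description(sample))
# prints [('T1', 'desc, one'), ('T2', 'desc two')]
```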
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.