Ignore columns from a sequential file
Moderators: chulett, rschirm, roy
Hi,
I am trying to read records from a sequential file, process them, and write them to a table.
The problem is that the sequential file is a global company file generated by one group for use by multiple divisions. I only need the first 70 columns of about 90. When I run the job I get a warning:
"Import consumed only 1968 bytes of the record's 2073 bytes (no further warnings will be generated from this partition)"
I could add 20 extra columns to make this warning go away, but the number of columns in this file could change whenever another division requests additional columns.
How can I read a sequential file and take just the first 70 columns without getting warnings?
The file is:
Delimiter = comma
Null field value = ''
Quote = double
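Before defining the column list, it can help to confirm how many fields the file actually contains. A minimal sketch (the filename is hypothetical; note the naive count will be inflated by any commas inside double-quoted fields):

```shell
# Print the number of comma-separated fields on the first line.
# Naive split: a comma inside a double-quoted field also counts.
head -1 global_company.csv | awk -F',' '{print NF}'
```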
Another option could be to use the External Source stage with a Source Program:
Code:
cut -d',' -f1-70 <INFILE>
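One caveat with the cut approach, since this file uses double quotes: cut has no notion of quoting, so a comma inside a quoted field is treated as a delimiter. A quick illustration:

```shell
# cut splits on every comma, including commas inside double-quoted fields,
# so this is only safe when no field contains an embedded comma.
echo 'id,"Smith, John",dept' | cut -d',' -f1-2
# prints: id,"Smith
```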
Another option: use a Unix command to replace the delimiting commas with another delimiter that does not occur in the data (while ignoring commas within quotes), then use the replacement delimiter to extract the first seventy columns.
See an example below of using nawk (on Solaris) to replace the delimiting commas with a pipe (|) and extract the first 70 columns:
Code:
nawk -F\" 'BEGIN{OFS=FS;} {for(i=1;i<=NF;i=i+2){gsub(/,/,"|",$i);} print $0;}' <infile> | awk -F"|" '{ for(i=1; i<=70; i++) printf("%s|", $i); printf("\n") }'
It is possible to merge the nawk and awk commands given above.
I believe that a much simpler and more elegant solution could be implemented using Perl.
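As a sketch of merging the two commands into a single awk invocation (the input filename is a placeholder): split each line on double quotes so the odd-numbered chunks are outside the quotes, turn their commas into pipes, then re-split on the pipe and print at most the first 70 fields.

```shell
# Quote-aware extraction of the first 70 comma-separated fields in one awk.
awk -F'"' 'BEGIN { OFS = FS }
{
    # Odd-numbered chunks are outside double quotes: replace their commas.
    for (i = 1; i <= NF; i += 2) gsub(/,/, "|", $i)
    # Re-split the rebuilt record on the pipe and keep at most 70 fields.
    n = split($0, a, "|")
    if (n > 70) n = 70
    for (i = 1; i <= n; i++) printf "%s|", a[i]
    printf "\n"
}' infile
```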
And I believe that a much simpler and more elegant solution can be implemented using the Server version of the Sequential File stage:
"Suppress row truncation warnings. If the sequential file being read contains more columns than you have defined, you will normally receive warnings about overlong rows when the job is run. If you want to suppress these messages (for example, you might only be interested in the first three columns and happy to ignore the rest), select this check box."
-craig
"You can never have too many knives" -- Logan Nine Fingers