how to read this file

nsm
Premium Member
Posts: 139
Joined: Mon Feb 09, 2004 8:58 am

Post by nsm »

Hi,

I have a file coming in the format shown below.
It is basically questionId, sequence number, question, answerId, answer.
It looks like a tab-delimited file, but when I define it that way it is not read correctly.
One more thing: there is no limit on the number of answers I receive.

Please advise how I can read it.

Thanks..

q16 67
How likely are you to visit Elidel.com again in the future?
1 Very likely
2 Somewhat likely
3 Not too likely
4 Not at all likely
5 Don't know

q17 68
Have you received emails from Elidel.com since you visited the website a couple of months ago?
1 Yes
2 No
3 Don't know

q18 69
And, in general, did you...
1 Read the entire email message(s)
2 Skim the email message(s)
3 Only read the subject line of the message(s)

q19 70
Did you click on any links to the Web in the email(s) that you received?
1 Yes
2 No
3 Don't know

no_checkc1 71
Do not wish to receive a check for $10.00. -
0 No
1 Yes
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Do you have each question and all its answers on a single line separated by tabs? If so, your first problem will be the differing number of columns between rows. When you import your file definition through the Manager, make sure it imports with 5 answer fields; if it shows fewer than 5, manually add the required fields. The import screen has a preview pane which will show you whether the file is in fact tab delimited.

Some rows will have fewer than five answers. To avoid an error on these rows, check the box for "Suppress Row Truncation Errors" on the Format tab of the sequential file stage. Then go to the Columns tab and scroll across to the "Incomplete Column" column. In a delimited text file this setting handles missing columns. For the question fields set this to "Replace".
nsm
Premium Member
Posts: 139
Joined: Mon Feb 09, 2004 8:58 am

Post by nsm »

I tried that; it's not working.
The file apparently comes from an AS/400 machine, and the people generating it are not sure how it is produced, so they could not answer questions such as what the delimiter is.
They also say they cannot supply everything in a single record.

When I look at the file, it seems that:
the question id and sequence number are in one record,
then the question is in one record, and each answer number and answer are in one record.

Sorry, I described it as tab delimited, which it is not.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

It sounds like a type of complex flat file where different rows have a different record structure. These files are usually fixed width.

You appear to have a record stretching over three rows in a predictable hierarchy: question id and sequence number on one row, the question on the next row, and the answers on the rows after that. You can pull the rows into one set of values using stage variables in a transformer. Have a look at this post for a description:
viewtopic.php?t=86826

You are still going to have to identify whether there is a delimiter or whether the file is fixed width so you can separate the answers. Perhaps you could post the first few rows here as they appear in the input file. If you can identify the delimiter, you can read each row in as a single large piece of text and use the FIELD function to break the answers up into separate values.
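For anyone unfamiliar with FIELD, here is a rough Python emulation of what it does to a delimited row. It is only a sketch for illustration: the helper name `field` is hypothetical, and it does not reproduce every edge case of the DataStage function (an empty delimiter, for instance).

```python
def field(text, delimiter, occurrence, count=1):
    """Return `count` delimited substrings of `text`, starting at the
    1-based position `occurrence`; empty string if there are too few."""
    parts = text.split(delimiter)
    start = occurrence - 1
    return delimiter.join(parts[start:start + count])


TAB = "\t"
answer_row = "1" + TAB + "Very likely"

answer_id = field(answer_row, TAB, 1)    # first tab-delimited value
answer_text = field(answer_row, TAB, 2)  # second tab-delimited value
```

Reading each row as one string and splitting afterwards means the number of rows per question does not need to be known in advance.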
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Look at the file using a hex editor of some kind (such as the UNIX command od -x) to determine exactly what the delimiter characters are. Then you're off and running. Until the AS/400 types change their program! :roll:
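If od is not to hand, the same check can be sketched in Python. This is only an illustration: the sample bytes below are invented (tabs and CR/LF line ends are an assumption, not taken from the real file), and `control_byte_counts` is a hypothetical helper name.

```python
from collections import Counter


def control_byte_counts(data):
    """Count each control byte (below 0x20, or 0x7f) found in `data`."""
    counts = Counter(f"0x{b:02x}" for b in data if b < 0x20 or b == 0x7F)
    return dict(counts)


# Invented sample; for a real file use open(path, "rb").read() instead.
sample = b"q16\t67\r\n1\tVery likely\r\n"

hex_dump = sample[:8].hex(" ")        # space-separated hex, like a hex editor
counts = control_byte_counts(sample)  # which control bytes occur, and how often
```

A byte that shows up once or more on every row (0x09 for a tab, for example) is a good delimiter candidate.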
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nsm
Premium Member
Posts: 139
Joined: Mon Feb 09, 2004 8:58 am

Post by nsm »

Thanks Vincent..

I can now see that the file is tab delimited.
The question id and sequence number are in one record.
The next record is the question.
The number of records after that depends on the number of answers. For example, if there are 3 answers, there will be 3 records after the question record,
each with an answer number and an answer.

Please suggest how to read that.

nsm.
nsm
Premium Member
Posts: 139
Joined: Mon Feb 09, 2004 8:58 am

Post by nsm »

Has anybody come across the situation described above?

If so, please advise.

Thanks
NSM.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Splitting the file into three outputs is easy. Read the file with a sequential file stage with one VarChar column defined. Define a Transformer stage with three outputs. Constraint expressions are:
Matches "0X' '1N0N" (question and sequence)
Matches "1N0X" (response option)
Matches "1A0X" (question text)

Parse into columns on the output link. For example, the sequence is

Code:

Field(inlink.column, Char(9), 2, 1)

(Even better would be to have TAB defined as a Stage variable, initialized to Char(9). That way it doesn't have to be re-calculated for each row processed.)
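The three-way split can be tried outside DataStage with a small Python sketch. The regular expressions are my approximate translation of the three Matches patterns, not DataStage syntax, and accepting a tab as well as a space before the sequence number is an assumption based on the file being tab delimited. Note that a header row such as "q16 67" also begins with a letter, so this sketch tests the header pattern first.

```python
import re

# Approximate regex equivalents of the three constraint patterns.
HEADER = re.compile(r".*[ \t]\d+")      # ~ 0X' '1N0N : ends in separator + digits
ANSWER = re.compile(r"\d.*")            # ~ 1N0X      : starts with a digit
QUESTION = re.compile(r"[A-Za-z].*")    # ~ 1A0X      : starts with a letter


def classify(line):
    """Return which of the three outputs a row would be sent to."""
    if HEADER.fullmatch(line):
        return "header"
    if ANSWER.fullmatch(line):
        return "answer"
    if QUESTION.fullmatch(line):
        return "question"
    return "other"


kinds = [classify(row) for row in (
    "q16\t67",
    "How likely are you to visit Elidel.com again in the future?",
    "1\tVery likely",
)]
```

A question whose text happens to end in a space followed by digits would be taken for a header by these patterns, so check the real data for such rows.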
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

You haven't mentioned how you are planning on transforming and writing it out. Consider keeping the question id as a stage variable, then when you output the question row and the answer rows you can include your saved question id so all rows for a particular question include the question id they belong to.
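The "remember the question id" idea, sketched in Python rather than as a Transformer. The local variables play the role of the stage variables. The row tests (two tab-separated values, one of them numeric) and the name `flatten` are assumptions made for this illustration, not the actual job design.

```python
def flatten(lines):
    """Turn header / question / answer rows into one output row per answer,
    each carrying the question id, sequence and question text it belongs to."""
    rows = []
    question_id = sequence = question = None   # remembered between rows
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue                            # skip blank separator rows
        parts = line.split("\t")
        if len(parts) == 2 and parts[0].isdigit():
            # Answer row: emit it with the saved question details.
            rows.append((question_id, sequence, question, parts[0], parts[1]))
        elif len(parts) == 2 and parts[1].isdigit():
            # Header row: save the new question id and sequence number.
            question_id, sequence, question = parts[0], parts[1], None
        else:
            question = line                     # question text row
    return rows


sample = [
    "q17\t68\n",
    "Have you received emails from Elidel.com?\n",
    "1\tYes\n",
    "2\tNo\n",
]
flat = flatten(sample)
```

Each output row then has the five values from the original post: question id, sequence number, question, answer id, answer.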
nsm
Premium Member
Posts: 139
Joined: Mon Feb 09, 2004 8:58 am

Post by nsm »

Thanks Ray and Vincent.

I think I am almost there.

Thanks once again for your wonderful suggestions.

nsm.