Complex Flat File Stage server vs parallel

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

jlitherl
Participant
Posts: 6
Joined: Wed Jun 21, 2006 2:15 am
Location: London


Post by jlitherl »

I'm writing a parallel job to read a complex flat file (defined by COBOL Copybook) and interpret the data according to an ID field, signifying which of the REDEFINES sections applies.
I can do this in a server job fairly easily and intuitively with multiple output links, defining the select columns and selection criteria.
When I try to do the equivalent task in a parallel CFF stage, the user interface is different: it doesn't let me define multiple output links, select columns, etc.
It has been suggested I wrap the server job in a parallel container, which seems to work, but looks a bit like a bodge to me.
Does anyone know a prettier (i.e. the correct) way to do this, please?
Thanks
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard. :D

You are correct that the interface is different; remember that the parallel version must take into account the need to split the rows into multiple parallel streams.

Is there anything in particular about which you want to ask concerning the parallel CFF stage?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jlitherl
Participant
Posts: 6
Joined: Wed Jun 21, 2006 2:15 am
Location: London

Post by jlitherl »

Thanks for your reply, Ray.

I'm just trying to do what seems to be the obvious thing (well, it is very easy to do with the server CFF stage): read a record in the CFF stage, then, according to the value of the record ID (record type), branch off and process the record according to whichever REDEFINES clause applies.
In the CFF server stage, on the Selection Criteria tab of the Output tab, you can choose which of the output links is chosen depending on the value of the record ID.
Whereas, with the parallel CFF stage, I seem to be constrained to a single output link.
To do this in a parallel job, should I be using a combination of two stages? A CFF followed by a Switch, perhaps?
I'll be grateful for any further advice. Thanks.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

You are right. The CFF stage in PX cannot change its metadata dynamically based on the data, which is what REDEFINES requires. So you either need to separately process the data for the different conditions before reading it into the CFF stage, or use other stages to filter it out. Read the record as a varchar (if possible) and deal with it within the job.
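The "read as varchar, then dispatch on the record ID" idea above can be sketched outside DataStage. A minimal Python analogy, assuming a hypothetical copybook where the first two bytes hold the record ID and the remaining bytes are REDEFINEd per type (all offsets and field names here are invented for illustration):

```python
# Each record arrives as one raw string; the record-ID field decides
# which REDEFINES layout applies to the remaining bytes.
# Offsets and layouts below are hypothetical examples, not from a real copybook.

def parse_record(raw: str) -> dict:
    rec_type = raw[0:2]                 # record-ID field (first two bytes, assumed)
    if rec_type == "01":                # e.g. a "header" layout
        return {"type": rec_type,
                "name": raw[2:12].rstrip(),
                "date": raw[12:20]}
    elif rec_type == "02":              # e.g. a "detail" layout REDEFINing the same bytes
        return {"type": rec_type,
                "item": raw[2:8].rstrip(),
                "qty": int(raw[8:12])}
    else:                               # unrecognised record ID: pass through raw (reject link)
        return {"type": rec_type, "raw": raw}

records = ["01JSMITH    20060621", "02WIDGET0042        "]
parsed = [parse_record(r) for r in records]
```

This mirrors what the server CFF stage's Selection Criteria tab does declaratively: one test on the ID field chooses which column layout (output link) applies.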
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
jlitherl
Participant
Posts: 6
Joined: Wed Jun 21, 2006 2:15 am
Location: London

Post by jlitherl »

Thanks for the help.

Apparently it is possible to read a CFF with redefines in parallel with a little jiggery-pokery, a switch and a Column Import stage.

The description is at:
http://www.dsxchange.com/viewtopic.php? ... +flat+file
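The pattern described there separates routing from parsing: a Switch stage only routes each raw row by its record ID, and a Column Import stage on each output link then cuts that link's rows into columns. A plain-Python sketch of that division of labour (the layouts and offsets are invented examples):

```python
# Sketch of the Switch + Column Import pattern: the Switch routes rows by
# record ID; a per-link Column Import then parses the bytes with that link's
# layout. Layouts and offsets below are hypothetical, for illustration only.

def switch(rows, key_start=0, key_end=2):
    """Route raw rows into per-record-type buckets (the Switch stage's role)."""
    buckets = {}
    for row in rows:
        buckets.setdefault(row[key_start:key_end], []).append(row)
    return buckets

def column_import(row, layout):
    """Cut one raw row into named columns (the Column Import stage's role)."""
    return {name: row[a:b].strip() for name, (a, b) in layout.items()}

layouts = {
    "01": {"name": (2, 12), "date": (12, 20)},   # header REDEFINES layout
    "02": {"item": (2, 8), "qty": (8, 12)},      # detail REDEFINES layout
}

rows = ["01JSMITH    20060621", "02WIDGET0042        "]
routed = switch(rows)
parsed = {t: [column_import(r, layouts[t]) for r in rs]
          for t, rs in routed.items() if t in layouts}
```

The point of the split is that the Switch never needs to understand the REDEFINEd columns at all, so its metadata stays a single raw field, which is what makes the approach workable in parallel.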

I'm getting nearer but still not quite there...

Regards
JL