Job Design - Request for comment

Poll: Is this a well designed parallel job?
Poll ended at Fri Sep 16, 2011 2:37 pm

Yes, it is a very good design: 0 votes
It's okay, if you have powerful enough infrastructure: 1 vote (50%)
No, this is a bad design: 1 vote (50%)

Total votes: 2

bashbal
Premium Member
Posts: 23
Joined: Mon Mar 01, 2004 12:26 pm
Location: Milwaukee, WI

Post by bashbal »

I have some concerns about this job design. The job has 6 concurrent ABAP/RFC stages that hit the SAP system, and it then merges the data into a sequential file. A subsequent job will load it into our BI database.

The concerns I have are:
1) This may overload my network/SAP connections. Since this job runs in parallel SMP, there could be many more than 6 simultaneous connections. In my experience with remote systems, that many connections can cause problems because of the added overhead and load.

2) The entire job is dependent on successful completion of all SAP connects. If one fails, the entire job fails and must be restarted from the beginning.

3) From a support standpoint, with all of these connections, it could be difficult to debug and diagnose failures.

[Image: screenshot of the job design]

I would recommend having a separate job for each ABAP connection that would write its own sequential file, then save a copy to archive. I would change the existing job by replacing the ABAP stages with sequential files. This would separate the process into smaller units of work, which are easier to manage.
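To make the proposal concrete, here is a minimal sketch in Python rather than DataStage, so it only illustrates the shape of the idea. Each extract lands its own file and saves an archive copy, and the downstream step reads the landed files instead of connecting to SAP. The names `land_extract` and `merge_landed`, the pipe delimiter, the directory layout, and the sample rows are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def land_extract(name, rows, landing_dir, archive_dir):
    """Write one extract to its own delimited file, then copy it to the archive."""
    landing_dir.mkdir(parents=True, exist_ok=True)
    archive_dir.mkdir(parents=True, exist_ok=True)
    landed = landing_dir / f"{name}.txt"
    with landed.open("w", encoding="utf-8") as fh:
        for row in rows:
            fh.write("|".join(str(col) for col in row) + "\n")
    # Keep an archived copy of exactly what was landed for this run.
    shutil.copy2(landed, archive_dir / landed.name)
    return landed


def merge_landed(files, target):
    """Stand-in for the downstream job: it reads landed files, not SAP."""
    with target.open("w", encoding="utf-8") as out:
        for f in files:
            out.write(f.read_text(encoding="utf-8"))
    return target


def demo():
    base = Path(tempfile.mkdtemp())
    # Hypothetical extract names and rows; real rows would come from the ABAP stages.
    extracts = {"extract_1": [(1, "a")], "extract_2": [(2, "b")]}
    landed = [
        land_extract(name, rows, base / "landing", base / "archive")
        for name, rows in extracts.items()
    ]
    merged = merge_landed(landed, base / "merged.txt")
    archived = sorted(p.name for p in (base / "archive").iterdir())
    return merged.read_text(encoding="utf-8"), archived
```

Because each unit leaves a file behind, a failed run can be inspected by looking at which landed files exist and what they contain.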

I would also add that, while I have 8+ years with Server edition, Parallel and SAP are brand new to me and my company.
Lyle
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Concern 1: Judging from the fan-out icon on each link in the screenshot, each ABAP stage is running sequentially. I've not used the ABAP stage myself, but if that reading is right, you would have only 6 simultaneous connections when the job starts.

Concern 2: If you had separate jobs for each ABAP and one failed, you could restart your sequence job and it could restart the failed job and save you some time. But if time lapses between the failure and the restart and you need all 6 extracts to begin together to keep data in sync, then the current job design may be better.
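As a rough illustration of the restart behaviour described for Concern 2, the sketch below is plain Python, not a DataStage sequence job. The function `run_sequence`, the `completed` set used as a checkpoint, and the job names are hypothetical. A second pass skips units that already finished and reruns only the one that failed.

```python
def run_sequence(jobs, completed):
    """Run every job not yet completed; return the names that failed this pass."""
    failed = []
    for name, job in jobs.items():
        if name in completed:
            continue  # finished in an earlier pass, so it is not rerun
        try:
            job()
            completed.add(name)
        except Exception:
            failed.append(name)
    return failed


def demo():
    calls = []
    attempts = {"extract_3": 0}

    def ok(name):
        def job():
            calls.append(name)
        return job

    def flaky():
        # Fails on the first attempt only, to simulate a dropped connection.
        calls.append("extract_3")
        attempts["extract_3"] += 1
        if attempts["extract_3"] == 1:
            raise ConnectionError("simulated SAP connection failure")

    jobs = {
        "extract_1": ok("extract_1"),
        "extract_2": ok("extract_2"),
        "extract_3": flaky,
    }
    completed = set()  # checkpoint shared across passes
    first = run_sequence(jobs, completed)
    second = run_sequence(jobs, completed)
    return first, second, calls
```

The trade-off from the paragraph above still applies: the rerun extract is taken later than the others, so this only suits cases where the six extracts do not have to start together.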

Concern 3: Time will tell either way. The job log should highlight a stage name when a failure occurs.

Sequential files can be good for viewing landed data and archiving versions between runs. If you're processing a large volume of data then you may want to try parallel datasets. Read up on the dataset stage.
Choose a job you love, and you will never have to work a day in your life. - Confucius
bashbal
Premium Member
Posts: 23
Joined: Mon Mar 01, 2004 12:26 pm
Location: Milwaukee, WI

Post by bashbal »

Thanks for the feedback.

We have researched datasets and will be using them extensively in our ETL designs.