DSXchange

Posted: **Fri Sep 09, 2011 2:37 pm**

I have some concerns over this job design. The job has 6 concurrent ABAP/RFC stages that hit the SAP system and then merges the data into a sequential file. A subsequent job will load it into our BI database.

The concerns I have are:
1) This may overload my network/SAP connections. Since this job runs in parallel SMP, there could be many more than 6 simultaneous connections. In my experience, that many simultaneous connections to a remote system can cause problems because of the added overhead and load.

2) The entire job depends on all of the SAP connections completing successfully. If one fails, the entire job fails and must be restarted from the beginning.

3) From a support standpoint, with all of these connections, it could be difficult to debug and diagnose failures.

I would recommend having a separate job for each ABAP connection that would write its own sequential file, then save a copy to archive. I would change the existing job by replacing the ABAP stages with sequential files. This would separate the process into smaller units of work, which are easier to manage.
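To make the idea concrete, here is a rough Python sketch (not DataStage code) of one extract treated as its own unit of work: it lands its rows in its own file, then saves a timestamped copy to an archive folder. The function name, the `fetch_rows` callable, and the directory layout are all made up for illustration; the real ABAP extract would take the place of `fetch_rows`.

```python
import csv
import shutil
from datetime import datetime
from pathlib import Path


def land_and_archive(name, fetch_rows, landing_dir, archive_dir, run_stamp=None):
    """Write one extract to its own file, then keep a timestamped archive copy.

    `fetch_rows` is a hypothetical stand-in for a single SAP extract: any
    callable that returns an iterable of rows (tuples or lists).
    """
    landing_dir = Path(landing_dir)
    archive_dir = Path(archive_dir)
    landing_dir.mkdir(parents=True, exist_ok=True)
    archive_dir.mkdir(parents=True, exist_ok=True)

    # One timestamp per run so archived versions do not overwrite each other.
    run_stamp = run_stamp or datetime.now().strftime("%Y%m%d_%H%M%S")
    landed = landing_dir / f"{name}.csv"

    # Land the data: each extract owns exactly one file.
    with landed.open("w", newline="") as handle:
        writer = csv.writer(handle)
        for row in fetch_rows():
            writer.writerow(row)

    # Archive a copy; the landed file stays in place for the merge job to read.
    archived = archive_dir / f"{name}_{run_stamp}.csv"
    shutil.copy2(landed, archived)
    return landed, archived
```

The merge job would then read only the landed files, so it never touches SAP directly.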

I would also add that, while I have 8+ years with Server edition, Parallel and SAP are brand new to me and my company.

Posted: **Sat Sep 10, 2011 11:19 pm**

Concern 1: From the screenshot, it looks like each ABAP stage is running sequentially, because of the fan-out icon on each link. I've not used the ABAP stage myself, but if that icon means what it usually does, you only have 6 simultaneous connections when the job starts.

Concern 2: If you had a separate job for each ABAP extract and one failed, you could restart your sequence job and it would rerun only the failed job, which saves you some time. But if time lapses between the failure and the restart, and you need all 6 extracts to begin together to keep the data in sync, then the current job design may be better.
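The restart behaviour I mean looks roughly like this. It is a plain Python sketch of the logic only, not DataStage; `run_extracts` and the extract names are invented for illustration, and each callable stands in for one extract job.

```python
def run_extracts(extracts, already_done=()):
    """Run each extract independently and report what succeeded and what failed.

    `extracts` maps a (hypothetical) job name to a callable that performs it.
    `already_done` lists names that finished in an earlier attempt.
    """
    succeeded = set(already_done)
    failed = {}
    for name, run in extracts.items():
        if name in succeeded:
            continue  # finished earlier, so the restart skips it
        try:
            run()
            succeeded.add(name)
        except Exception as exc:
            # One failure does not stop the remaining extracts.
            failed[name] = str(exc)
    return succeeded, failed


# First attempt: pass no `already_done`. On restart, pass the names that
# succeeded last time so only the failed extracts run again.
```

Note that on a restart the skipped extracts keep the data from the earlier attempt, which is exactly the sync issue above: the rerun extract sees SAP at a later point in time than the others did.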

Concern 3: Time will tell either way. The job log should highlight a stage name when a failure occurs.

Sequential files can be good for viewing landed data and archiving versions between runs. If you're processing a large volume of data, then you may want to try parallel data sets. Read up on the Data Set stage.

Posted: **Thu Sep 15, 2011 5:47 am**

Thanks for the feedback.

We have researched datasets and will be using them extensively in our ETL designs.


Job Design - Request for comment
