DSXchange: DataStage and IBM Websphere Data Integration Forum
mail2krishna79


since May 2006

Group memberships:
Premium Members

Joined: 21 Apr 2006
Posts: 56

Points: 431

Posted: Fri Mar 25, 2011 12:28 pm

DataStage® Release: 8x
Job Type: Parallel
OS: Unix
Hi,
Someone designed a job in my project that contains two Transformer stages.
The first Transformer uses Hash partitioning and the second Transformer uses Same partitioning. Everything else is a straight move.

Job design is like this :
Ext Src Stg---->Tfm--->Remdup---->OraConnector---->Tfm---->OraConnector

Can anyone suggest how to improve performance in this scenario?

Thanks
hk
soumya5891
Participant



Joined: 07 Mar 2011
Posts: 133

Points: 860

Posted: Fri Mar 25, 2011 12:44 pm

If the Transformers only handle partitioning, there is no need for them; partitioning can be handled in any other stage.
For example, for the Remove Duplicates stage, put a Sort stage before it with Hash partitioning and then set the partitioning on the Remove Duplicates stage to Same.
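
To illustrate why hash partitioning on the key ahead of the sort lets the Remove Duplicates stage keep Same partitioning, here is a minimal Python sketch of the idea. It is not DataStage code: the function names, the use of `zlib.crc32` as the hash, and the default of four partitions are all assumptions made for illustration. The point is that rows with the same key always land in the same partition, so each partition can be sorted and de-duplicated independently.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, key, num_partitions):
    """Route each row to a partition using a deterministic hash of its key value."""
    partitions = defaultdict(list)
    for row in rows:
        # crc32 is an illustrative choice; it is stable across runs.
        p = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[p].append(row)
    return partitions


def sort_and_dedup(partition, key):
    """Sort one partition on the key and keep the first row per key value."""
    result = []
    previous = object()  # sentinel that never equals a real key value
    for row in sorted(partition, key=lambda r: r[key]):
        if row[key] != previous:
            result.append(row)
            previous = row[key]
    return result


def remove_duplicates(rows, key, num_partitions=4):
    """Hash partition, then sort and de-duplicate each partition on its own."""
    partitions = hash_partition(rows, key, num_partitions)
    out = []
    for p in sorted(partitions):
        # No repartitioning here: this is the analogue of "Same".
        out.extend(sort_and_dedup(partitions[p], key))
    return out
```

Because identical keys can never be split across partitions, no cross-partition comparison is needed after the initial hash.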

_________________
Soumya
jwiles



Group memberships:
Premium Members

Joined: 14 Nov 2004
Posts: 1274

Points: 10406

Posted: Fri Mar 25, 2011 1:28 pm

FYI: There is typically little need to explicitly specify Same partitioning. When partitioning is set to Auto, the engine will normally not repartition unless it determines a need to do so.

If all that the transformers are doing is copying input columns to the output (no null checking, datatype conversion, column generation, etc.), then they are unnecessary and you can do as soumya has suggested.

What type of data volume is being processed and what do the Oracle connectors in the job do, especially the first one?

Regards,

_________________
- james wiles


All generalizations are false, including this one - Mark Twain.
mail2krishna79


since May 2006

Group memberships:
Premium Members

Joined: 21 Apr 2006
Posts: 56

Points: 431

Posted: Tue Mar 29, 2011 4:31 pm

Hi Soumya/jwiles,
Thank you so much for your responses. The data volume is 10 million rows.
Sorry for the late reply.
jwiles



Group memberships:
Premium Members

Joined: 14 Nov 2004
Posts: 1274

Points: 10406

Posted: Tue Mar 29, 2011 6:09 pm

10 million rows through transformers is not an issue in your case, so long as you are doing as little in them as you have stated. That leaves the Oracle Connectors, Remove Duplicates and the External Source stage. So, here is another series of questions as we try to narrow it down.

1) How long does this job run when processing 10 million records?

2) What is External Source calling to provide the data?

3) (as asked before) What do the Oracle Connectors do in this job?

4) What degree of parallelism is the job being run at?

5) How wide are the records? (Average bytes per record)

Potential bottlenecks are the External Source stage, the Oracle Connectors and the Remove Duplicates stage.

1) Determine how quickly the External Source stage runs by creating a job with just it dumping to a Copy stage: ExtSrc-->Copy. Compile and run to see how long it takes to provide the 10 million records.

2) The Remove Duplicates stage probably has a Sort inserted in front of it at runtime. Depending on the total size of the data being processed, it may have a minor impact.

3) The Oracle connectors: What does each one do (I suspect the first is an Upsert of some sort and the second loads rejects)? Do they load to the same table or separate tables? Do they use custom SQL statements? Are the tables being loaded to indexed, and are the indexes kept up to date?
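
Once the run time, degree of parallelism and average record width are known, a rough throughput figure makes it easier to compare the full job against the ExtSrc-->Copy test. The Python sketch below is only a back-of-envelope calculation; the function name and the sample values for run time, record width and partition count are hypothetical placeholders, not figures from this job. Only the 10 million row count comes from the thread.

```python
def throughput(rows, elapsed_seconds, avg_bytes_per_row, partitions):
    """Return rough throughput figures for one job run."""
    rows_per_sec = rows / elapsed_seconds
    mb_per_sec = rows_per_sec * avg_bytes_per_row / (1024 * 1024)
    return {
        "rows_per_sec": rows_per_sec,
        "mb_per_sec": mb_per_sec,
        "rows_per_sec_per_partition": rows_per_sec / partitions,
    }


if __name__ == "__main__":
    # Hypothetical inputs: substitute the measured values for your job.
    full_job = throughput(10_000_000, elapsed_seconds=1800,
                          avg_bytes_per_row=200, partitions=4)
    source_only = throughput(10_000_000, elapsed_seconds=300,
                             avg_bytes_per_row=200, partitions=4)
    for name, stats in (("full job", full_job), ("ExtSrc-->Copy", source_only)):
        print(name, {k: round(v, 1) for k, v in stats.items()})
```

If the source-only job delivers rows much faster than the full job, the External Source stage is unlikely to be the bottleneck and attention can shift to the Oracle Connectors.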

_________________
- james wiles


All generalizations are false, including this one - Mark Twain.
All times are GMT - 6 Hours