Performance in DS vs Oracle

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
LavanyaRamesh007
Participant
Posts: 42
Joined: Mon Apr 21, 2008 1:49 am

Performance in DS vs Oracle

Post by LavanyaRamesh007 »

Hi,
In my project the source is Oracle and target is also oracle. For some of the transformations like concatenations, substring functions are written in Oracle( in SQL) rather than doing in DS transformer. My point is that doing the smae in DS will yield good performance since DS uses parallelism concept.. in SQL since we are not making the query in parallel mode i think when the job goes to production there will be a hiccup in performance.
Kindly let me know abt the performance of both?
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Who said Oracle does not use "Parallelism" ? It may be how you have configured it.

Doing in a single query may inturn avoid a transformer or modify stage and thus reducing number of process spawned.

Perform few sample tests in oracle and DS to see which is useful. Also check the plan in both.
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

There are a lot of factors to consider. How loaded down is the database? Are you trying to off load resources from the database? Most of the time if you are doing simple transforms like straight copies then doing everything in the native database is faster. The more complex the transforms the better a ETL should become. There maybe other considerations like at some point Oracle may not be the target then DataStage is a better solution.
Mamu Kim
velagapudi_k
Premium Member
Premium Member
Posts: 142
Joined: Mon Jun 27, 2005 5:31 pm
Location: Atlanta GA

Post by velagapudi_k »

Sainath.Srinivasan wrote:Who said Oracle does not use "Parallelism" ? It may be how you have configured it.

Doing in a single query may inturn avoid a transformer or modify stage and thus reducing number of process spawned.

Perform few sample tests in oracle and DS to see which is useful. Also check the plan in both.
Sainath this is kind of interesting. How to check the plan in datastage?
Can you please let me know briefly or direct me to some documentation.
Appreciate your help.
Venkat Velagapudi
mk_ds09
Participant
Posts: 72
Joined: Sun Jan 25, 2009 4:50 pm
Location: Pune

Post by mk_ds09 »

Yes..as it said it really depends on the various factors..

However doing the changes in quries is better in one of the aspect as it will avoid more i/p operations.
It will give the DS only the formated data.
Otherwise it will just read and give all the data to DS where DS needs to perform the changes.

In most of the cases, where you say performance is to get rid of the transformations/derivations in the early stage so that the later part of processing will not have that overhead !

Hope this makes some sense ! :D
-----------------------------------
Regards
MK

What would you attempt to do if you knew you could not fail?

-----------------------------------
richdhan
Premium Member
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi,
velagapudi_k wrote:How to check the plan in datastage?
The plan in Datastage is nothing but the job score.

Set $APT_DUMP_SCORE=1 and check the job score in the director. There will be 2 scores. One from the license operator. The next one is the actual job score.

The job score gives information on the number of processes, framework inserted operators, the partition and collection between operators, operator combination etc.

HTH
--Rich
velagapudi_k
Premium Member
Premium Member
Posts: 142
Joined: Mon Jun 27, 2005 5:31 pm
Location: Atlanta GA

Post by velagapudi_k »

Thanks Rich.
Venkat Velagapudi
Post Reply