TOAD vs DataStage

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

arpitchopra
Participant
Posts: 35
Joined: Mon Nov 01, 2010 2:48 am
Location: New Delhi

Post by arpitchopra »

Reduced it and checked again. The array size is now 2000. I kept Rows per Transaction at 0 and set Prefetch Memory to 1024.
The job now takes 1 hour to run.
Arpit Chopra
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

'Sweet spot' means you have to try various values, dialing it up and back down again, until you find the appropriate value for this combination. Many things can affect this, with the settings you've mentioned being only part of it: for example, your average record length, indexes on the target, disk subsystem write speed, database settings, how often you commit, network traffic, etc.
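To make the dial-up/dial-down exercise concrete, here is a minimal, hypothetical sweep harness. Nothing here is a DataStage or OCI API: `sweep_array_size` and the caller-supplied `load_batch` function are stand-ins for whatever actually pushes a batch of rows to the target, so the harness itself just measures each candidate array size against the same total row count.

```python
import time

def sweep_array_size(load_batch, total_rows, candidate_sizes):
    """Time a full load at several array sizes and report the fastest.

    load_batch(n) is whatever actually sends `n` records to the target
    (an OCI executemany, a stage run, ...); it is supplied by the caller
    so this harness has no database dependency.
    """
    results = {}
    for size in candidate_sizes:
        start = time.perf_counter()
        sent = 0
        while sent < total_rows:
            batch = min(size, total_rows - sent)
            load_batch(batch)
            sent += batch
        results[size] = time.perf_counter() - start
    best = min(results, key=results.get)
    return best, results

# Demonstration with a do-nothing loader; in practice load_batch would
# perform the real insert and the timings would reflect your network,
# indexes, commit interval, and so on.
calls = []
best, results = sweep_array_size(calls.append, 10_000, [500, 2000, 5000])
```

In a real run you would widen or narrow `candidate_sizes` around whichever value wins, since the curve is usually shallow near the optimum.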

DataStage isn't anything magical; it is just another Oracle client. It leverages the standard OCI interface and TCP/IP transport layer, and tuning loads for a job isn't all that different from going through the same exercise outside of the tool, with the exception of understanding how the DataStage engine plays a role here and the path the data (possibly) takes across your network. For large data volumes with minimal transformations in a single instance, it is sometimes better to fall back on native tools like sqlplus / sqlldr / PL/SQL rather than a tool that is meant to extract and load heterogeneous data from anywhere to anywhere.
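For reference, the sqlldr fallback craig mentions looks something like the sketch below. The table, columns, and file names are hypothetical; `direct=true` requests a direct-path load, which bypasses much of the conventional SQL engine and is typically the fastest way to bulk-load flat-file data.

```
-- load_customers.ctl  (hypothetical table and columns)
LOAD DATA
INFILE 'customers.dat'
APPEND
INTO TABLE customers
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(cust_id, cust_name, created_dt DATE 'YYYY-MM-DD')
```

Invoked, under the same assumptions, as:

```
sqlldr userid=scott control=load_customers.ctl direct=true log=load.log
```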
-craig

"You can never have too many knives" -- Logan Nine Fingers
Abhijeet1980
Participant
Posts: 81
Joined: Tue Aug 15, 2006 8:31 am
Location: Zürich
Contact:

Post by Abhijeet1980 »

Arpit,

Please get in touch with your DBA; he would certainly be in a good position to help you.

He may also help you understand the differences in behavior between your queries (TOAD and DataStage).

Lastly, try posting your query here.

Kind regards,
Abhijit Gaikwad
daignault
Premium Member
Posts: 165
Joined: Tue Mar 30, 2004 2:44 pm
Contact:

Post by daignault »

Hi All,

Why not write custom SQL within the Oracle stage? That way you won't pull 17 million rows of data across the network: DataStage can initiate the SQL through a DataStage job and retrieve only the result.

I'm at a site right now where we are trying to do something similar, but with 2.3 billion rows of data on a multi-node grid. The designer tried to pull the data into DataStage across the network and process it there. By processing the data first on the source system, before pulling the subset of rows into the DataStage environment, we shaved 8 hours off the job execution time.

In larger sites with higher data volumes, the network can become the primary bottleneck. So we removed it.
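The push-down idea above can be shown in miniature. This is a sketch, not DataStage code: sqlite3 stands in for the remote Oracle source, and the `orders` table and its columns are invented for the example. The point is that a `WHERE` clause executed at the source means only the subset of rows ever crosses the wire, whereas fetching everything and filtering on the client moves the full table.

```python
import sqlite3

# Toy "source system" with 1,000 rows, of which only 1 in 10 is wanted.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "EMEA" if i % 10 == 0 else "OTHER", float(i)) for i in range(1000)],
)

# Anti-pattern: pull every row to the client, then filter locally.
pulled = con.execute("SELECT * FROM orders").fetchall()
subset_client = [row for row in pulled if row[1] == "EMEA"]

# Push-down: the source does the filtering; only the subset is transferred.
subset_source = con.execute(
    "SELECT * FROM orders WHERE region = 'EMEA'"
).fetchall()

print(len(pulled), len(subset_source))  # prints: 1000 100
```

Both approaches yield the same rows; they differ only in how many rows travel to the client, which is exactly the bottleneck being removed in the 2.3-billion-row case above.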

Regards
Post Reply