Read DB2 table performance

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
mukejee
Participant
Posts: 37
Joined: Thu Dec 11, 2008 5:38 am
Location: India,Bangalore

Read DB2 table performance

Post by mukejee »

Hi Experts,

I am trying to extract a DB2 table which holds 60 million records and i am trying to pull the entire data from the table.

What are the possible options that I should take care to get the best performance.

I am using DB2 connector stage to pull the data ..
Current array size = 1000
Record count = 2000

Its taking more than 20 - 25 minutes to extract the whole 60 million and load a dataset. There is no other stages in between.

Regards
Mukejee
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

First do some diagnosis - create a job that pulls the DB2 table into a Copy stage (which acts as a sink). That will allow you to determine whether the slowness is in the DB2 stage or the Data Set write operation.

Let us know, then we can address the correct problem instead of wasting time.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

You can use the partitioned read method which often speeds up DB2 access, assuming that is your bottleneck and not network performance.

If your source table is partitioned, then use that or partition your read using a non-nullable integer column with sufficient cardinality and the connector's "modulo" method, or create your own virtual column using a DB2 OLAP function.

This will ensure that there are multiple processes reading from DB2 rather than the default single thread and might speed up the reading process .
Post Reply