TAB_s3 ----> TAB_t3

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
jack66
Participant
Posts: 1
Joined: Tue Dec 02, 2008 10:25 am

TAB_s3 ----> TAB_t3

Post by jack66 »

hi,
I've a parallel DsJob with multiple "task" that done
more SELECT from source tables then INSERT into target tables.

The Job must "move" data from a set of table from a "source DB" forward a "target DB" with the same tables.

For example:

TAB_s1 ----> TAB_t1
TAB_s2 ----> TAB_t2
TAB_s3 ----> TAB_t3
...
...

The source DB is Sybase 12.5 and the target DB is Microsoft SqlServer 2007. The table "sx" and "tx" are similar.

Now, my job run correctly, but I've low performance when I use 20 multiple "task" (20 tables as source and 20 tables as target)...
Consider that the tables contains not more of 1 milion of record.


Today I've "split" the job in more job with 5 "multiple task" and the performance are better!


There is some documentation or best practies for help me to find the "right number" ?

Do you know how manage this kind of problem?

Thanks in advance

Ugo Ribecai
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The only approach is trial and trial. Start with one stream, then increase gradually until elapsed time doesn't reduce any more. Of course all this is predicated on "all else being equal" on the server machine.

You can use the DataStage resource estimator tool, but this does not report an estimate of elapsed time.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply