DataStage job running longer than expected

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

suresh_dsx
Participant
Posts: 160
Joined: Tue May 02, 2006 7:49 am

DataStage job running longer than expected

Post by suresh_dsx »

Hi,

We have a job that deletes data in 5 tables.
The DB2 stage type is DB2UDBAPI.


Code:

                             |------------------------>target table1
                             |------------------------>target table2
                             |
Source table ----->Copy stage------>target table3
                             |
                             |------------------------>target table4
                             |------------------------>target table5
                             

The source table has 600,000 records, and the job takes 15 hours to complete in the SIT environment.

The same job completes in 3 hours in the development environment.

We have asked the DataStage admin and the DBA to help monitor the process and find out why it is taking so long. They say it is not consuming many resources, and they have been unable to figure out the root cause.

I have searched the forums but did not find a similar issue.

Any help is greatly appreciated.
thanks -suresh
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What's different between the two environments?

Don't answer "nothing" because that's clearly not the case.

For example, are the tables indexed on the key in dev and not in sit? Or is the database local to dev and remote from sit? There are several other possibilities.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
suresh_dsx
Participant
Posts: 160
Joined: Tue May 02, 2006 7:49 am

Post by suresh_dsx »

Thanks, Ray, for the swift response.

First check: indexes on the tables. Both environments are in sync.
Second: DataStage scratch space. We have enough space.
Third: we compared the two jobs for any differences between the environments. The jobs are the same.

Are there any other checks we should request? We don't have admin access to monitor the job ourselves.

thanks Suresh
suresh_dsx
Participant
Posts: 160
Joined: Tue May 02, 2006 7:49 am

Post by suresh_dsx »

Now the job is running at an average of 10 records per second.

The DBA is happy with the indexes.
The DataStage admin confirms there is enough space in the temp area.

A source of 400,000 records completes within one hour.
With 500,000 records, the job runs for 15 hours.
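As a sanity check, the observed per-row rate alone accounts for the long runtime (a rough sketch using the figures quoted in this thread):

```python
# Rough runtime estimate at the observed throughput.
# 10 rows/sec is the rate reported from the Director monitor above;
# the row count is from this thread.
rows = 600_000
rows_per_sec = 10

seconds = rows / rows_per_sec
hours = seconds / 3600
print(f"{rows} rows at {rows_per_sec} rows/sec ~ {hours:.1f} hours")
```

At 10 rows/sec, 600,000 deletes take roughly 16.7 hours, which matches the 15 hours observed; the real question is why the per-row rate collapses once the volume grows past about 400,000 rows.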

Are there any checks we should request from the DBA/admin on this issue?

Any help is greatly appreciated.
Thanks -Suresh
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Are all of the 5 tables on the same server / database?
suresh_dsx
Participant
Posts: 160
Joined: Tue May 02, 2006 7:49 am

Post by suresh_dsx »

Yes, all the tables are in the same database schema and on the same server.
mail2krishna79
Charter Member
Posts: 56
Joined: Fri Apr 21, 2006 9:53 am

Post by mail2krishna79 »

Try recreating the same job from scratch; sometimes internal functions may not work properly after deleting and placing new stages. This might help.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

What other differences may there be between the two systems? Hardware? Physical location? Are the network connections different on SIT and DEV? Perform a traceroute and ping from each DS server to each database server (sources and targets).

Also look at the DB2 client configuration on each DS server.

What about how you're running the job? Degree of parallelism on SIT vs DEV?

What kind of throughput do you get from the source table on dev and sit? Create a job which just reads the table and dumps to a peek or copy and see what the difference in runtime is on the two environments. If you can do so with the targets, run a job that is a row generator going into the DB2 stage and write to a temporary table on each of the target database servers.
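To compare network paths when ICMP (ping/traceroute) is blocked, a TCP connect-latency probe from each DataStage server to each database server gives a rough equivalent. This is a minimal sketch; the hostname and port below are placeholders (DB2 commonly listens on 50000), so substitute your own SIT and DEV hosts:

```python
# Measure average TCP connect latency from this host to a database
# server, as a rough substitute for ping/traceroute when ICMP is
# blocked between environments.
import socket
import time

def connect_latency(host: str, port: int, tries: int = 5) -> float:
    """Average time in milliseconds to open a TCP connection."""
    total = 0.0
    for _ in range(tries):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=5):
            pass
        total += time.perf_counter() - start
    return total / tries * 1000

# Example (placeholder host/port -- substitute your own):
# print(connect_latency("sit-db2-host.example.com", 50000))
```

Run it once on the DEV DataStage server and once on the SIT server against their respective databases; a large gap points at the network rather than the database.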

The above is just a list of things to check as part of a process of elimination.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
greggknight
Premium Member
Posts: 120
Joined: Thu Oct 28, 2004 4:24 pm

Post by greggknight »

I am assuming you have watched this job run from Director and are not receiving a warning message for each row processed.

Have you confirmed that the records are actually going into the tables?
600,000 rows to DB2 should take less than 5 minutes; it does on my system, anyway.
"Don't let the bull between you and the fence"

Thanks
Gregg J Knight

"Never Never Never Quit"
Winston Churchill
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

A lot of factors can play into database deletes. The table's primary key may be a foreign key in several other tables, so DB2 has to check for the existence of child rows before actually deleting a record. Then, depending on the constraint definition, the delete may cascade or do nothing.
If this is z/OS DB2, it also depends on how much load is on the CPU, whether zIIP processors are involved, etc.
So, depending on everything that goes on under the covers, the normal time may be 5 minutes or it may be 5 hours.
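To check for such child-table constraints on DB2 for LUW, the DBA can query the `SYSCAT.REFERENCES` catalog view. A minimal sketch that builds the query (the schema and table names passed in are placeholders; run the resulting SQL with your usual DB2 client):

```python
# Build a DB2 (LUW) catalog query listing child tables whose foreign
# keys reference a given parent table, plus each constraint's delete
# rule ('C' = cascade, 'R' = restrict, 'A' = no action, 'N' = set null).
def child_table_query(parent_schema: str, parent_table: str) -> str:
    return (
        "SELECT tabschema, tabname, constname, deleterule "
        "FROM syscat.references "
        f"WHERE reftabschema = '{parent_schema.upper()}' "
        f"AND reftabname = '{parent_table.upper()}'"
    )

# Placeholder names -- substitute your own schema and target table:
print(child_table_query("MYSCHEMA", "target_table1"))
```

If any of the five target tables comes back with dependent children, each delete is paying for those existence checks (or cascades), which would explain a per-row cost far beyond a plain delete.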
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
mantums
Participant
Posts: 6
Joined: Wed Aug 30, 2006 10:14 pm
Location: Chennai

Re: DataStage job running longer than expected

Post by mantums »

Do some R&D.

Take a single source and load it to a single target table, and compare the runtime in both environments (SIT and DEV):

Source table ---------> Target table1

Rgds
Mahantesh
Mahantesh Shivanagi
Basel Development team
Standard Chartered
Bicchu
Participant
Posts: 26
Joined: Sun Oct 03, 2010 10:49 pm
Location: India

Post by Bicchu »

You can provide values suited to your environment for Array Size and Transaction Size in the DB2 API target stage in order to make your job faster.

I once faced such a situation, and by providing the correct values for my environment, my jobs ran faster.
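For intuition on the Transaction Size setting, a rough sketch (the candidate values below are illustrative, not recommendations for this job):

```python
# Illustrative effect of Transaction Size on commit frequency.
# A very small value means thousands of commits; a huge value (or 0,
# meaning one transaction for the whole run) can bloat the DB2
# transaction log and stall a large delete. The sweet spot is a
# moderate value tuned to the environment.
rows = 600_000

for txn_size in (100, 2_000, 50_000):
    commits = -(-rows // txn_size)  # ceiling division
    print(f"Transaction Size {txn_size:>6}: {commits} commits")
```

Comparing the actual Array Size / Transaction Size values between the DEV and SIT job runs is a quick check before tuning anything.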
Thanks,
Pratik.
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

Write the data to a Dataset first, then load from the Dataset to the tables.