dsjob -report vs DataStage C++ API

zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

Hi,

We are on DataStage 8.5. (As an aside, would it make sense to list individual versions in the Release drop-down? There are significant differences within the 8.x versions.)

We need to collect run-time statistics for the 10,000 jobs in a project. Currently a script loops over all 10,000 jobs and calls dsjob -report (without the logon clause) for each one.

It's a single-tier topology, so the Engine, Metadata and Services tiers are all on a single Linux box. The question is: does each dsjob call in the loop have to go through authentication?

At the moment it takes 40 minutes to get the data for all 10,000 jobs. If every call has to authenticate and attach to the same project again, I am wondering whether redoing this with the C++ API would save some time.
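
Before committing to a rewrite, it may help to measure what each call in the loop actually costs. The following is a minimal Python sketch, not the poster's script: it assumes dsjob is on the PATH, uses the BASIC report type, and takes a hypothetical project name and job list as input. The `runner` parameter is an illustrative hook so the timing logic can be exercised without a DataStage engine.

```python
import subprocess
import time
from typing import Callable, Dict, Iterable, List, Tuple


def dsjob_report_cmd(project: str, job: str) -> List[str]:
    """Build the argument list for one `dsjob -report` call (no logon clause)."""
    return ["dsjob", "-report", project, job, "BASIC"]


def run_dsjob(cmd: List[str]) -> str:
    """Run one dsjob command and return whatever it wrote to standard output."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def time_reports(
    project: str,
    jobs: Iterable[str],
    runner: Callable[[List[str]], str] = run_dsjob,
) -> List[Tuple[str, float]]:
    """Call the runner once per job and record the wall-clock seconds per call."""
    timings = []
    for job in jobs:
        start = time.perf_counter()
        runner(dsjob_report_cmd(project, job))
        timings.append((job, time.perf_counter() - start))
    return timings


def summarize(timings: List[Tuple[str, float]]) -> Dict[str, float]:
    """Total and mean seconds across all timed calls."""
    total = sum(seconds for _, seconds in timings)
    mean = total / len(timings) if timings else 0.0
    return {"jobs": len(timings), "total_s": total, "mean_s": mean}
```

Running `summarize(time_reports("MyProject", job_names))` on a sample of a few hundred jobs (both names here are placeholders) gives a mean cost per invocation that can be compared with the 40-minute total before any C++ is written.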

Coding it with the C++ API would take considerable time, so I would like the experts' comments first, to avoid the embarrassment of spending a lot of time getting it to work only to find no improvement in the elapsed time.

Thanks.
- Zulfi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

That's 250 jobs per minute which, given the number of internal tables that need to be opened and interrogated, strikes me as reasonable.

If you don't want to code it in C++ you could always create a routine in DataStage BASIC, which uses the same API. Depending on your skill set this may be a more economical approach.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

Hi Ray,

I completely agree that 250 jobs per minute is a reasonable speed. But if dsjob authenticates and attaches to the project on every call, we might do better with the C++ API, where we would authenticate only once and attach to the project only once. I would like your advice on whether doing this with the C++ API would be any faster, at least in theory. Whatever the number of jobs, 40+ minutes is long enough to justify looking for a better approach.

The Server routine approach is a little outside our standards, which require everything to be in Parallel and steer us away from Server jobs and routines (though we do use Sequencers).


Thanks
- Zulfi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Of course dsjob will authenticate each time you invoke it, which is (I daresay) what your script is doing.

I'm not sure that using the API would be that much faster; authentication is a tiny part of what you're doing.
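
A back-of-envelope check of this point: 40 minutes over 10,000 jobs is 0.24 seconds per job, and an API program that logs on and attaches once can save at most the fixed overhead of each dsjob invocation. The Python sketch below works that bound out. The overhead value is a pure assumption to be replaced with a measured figure; nothing in this thread says how large it is.

```python
def per_job_seconds(total_minutes: float, n_jobs: int) -> float:
    """Average elapsed seconds per job for the current dsjob loop."""
    return total_minutes * 60.0 / n_jobs


def best_case_minutes(total_minutes: float, n_jobs: int, overhead_s: float) -> float:
    """Elapsed minutes if the per-invocation overhead were paid once, not n_jobs times.

    overhead_s is the assumed fixed cost of one dsjob call (process start,
    authentication, project attach). It is a guess, not a measured value.
    """
    per_job = per_job_seconds(total_minutes, n_jobs)
    if not 0.0 <= overhead_s <= per_job:
        raise ValueError("overhead must lie between 0 and the per-job time")
    work_s = per_job - overhead_s  # time spent actually reading job information
    return (overhead_s + n_jobs * work_s) / 60.0
```

With a hypothetical overhead of 0.05 seconds per call, `best_case_minutes(40, 10000, 0.05)` comes out at roughly 31.7 minutes. The saving scales directly with the overhead, so if authentication really is a tiny part of each call, the rewrite buys very little.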

As to your standards, are you able to determine the reason (if any) behind that decision? There was some FUD sown when 8.0 first came out that server jobs would cease to be supported (mainly by sales people who got a higher commission for enterprise edition sales), but this is NOT TRUE.

Would you discard half the tools in any other kind of toolkit, such as all the even-sized spanners in a spanner set?
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

Hi Ray,

Regarding the standards: having once heard the rumour that Server jobs would be dropped, the admins worry that it may become a reality one day, so they want us to keep away from them as far as possible.

There was a time when all of our code was on Server edition, and now it is all on Parallel. Still, it was fun porting it from Server to Parallel.
- Zulfi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Server jobs are NOT going away. For a start, a sequence is a server job. I still make great use of server jobs, which can be finished while the parallel job is still composing the score.