Server vs parallel jobs


Moderators: chulett, rschirm, roy

rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Server vs parallel jobs

Post by rsunny »

Hi,

Since server jobs can run in Enterprise Edition, which configuration file do they use when they run there? When we run server jobs in Enterprise Edition, are they compiled to BASIC or with the C++ compiler? And can we compile the Transformer to BASIC in Enterprise Edition, or must it be compiled with the C++ compiler?

And if I am not wrong, we can't run parallel jobs in Server Edition because no parallel engine is installed, right?

Thanks in advance
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Just because you have the "Enterprise Edition" doesn't mean Server jobs work or run any differently.
Last edited by chulett on Tue Feb 08, 2011 10:55 am, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

The Enterprise Edition (Information Server for v8x) contains both the Server (Universe) and Parallel (Orchestrate) engines.

Server jobs are server jobs...they run in the server engine as always. They will not use a parallel configuration file because they do not utilize the parallel engine.

Parallel jobs run only on the parallel engine, except for those instances where you can use a BASIC transformer or other server-based stage as part of a parallel job. Those particular stages will be run in the server engine (probably as a child process of a parallel skeleton operator) while the rest of the parallel job runs in the parallel engine. BASIC transformers cannot be compiled as a C++ operator.

If you have not installed the parallel engine in an instance, due to licensing or other reason, you cannot run parallel jobs.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ah... a much more thorough reply. Thank goodness for new fingers. :wink:
-craig
rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

jwiles wrote: Server jobs are server jobs...they run in the server engine as always. They will not use a parallel configuration file because they do not utilize the parallel engine.

Parallel jobs run only on the parallel engine, except for those instances where you can use a BASIC transformer or other server-based stage as part of a parallel job. Those particular stages will be run in the server engine (probably as a child process of a parallel skeleton operator) while the rest of the parallel job runs in the parallel engine. BASIC transformers cannot be compiled as a C++ operator.

Regards,
Hi,

Thank you for your reply.

So when we purchase Enterprise Edition, it also includes the server engine, right?

Can we use a server-based stage in a parallel job? I thought a parallel job contains only parallel stages, not server-based stages. I am a little confused; can you tell me which server-based stages we can use in parallel jobs?

How does DataStage differentiate when we use server-based stages in parallel jobs? For example, the Sort stage exists in both server and parallel.

Thanks in advance
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There are two Sort stages, even though they're called the same thing. (What else would you call it?) Open the properties in the Stage Types branch of the Repository to note significant differences.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

Hi,

Thanks for your reply, but how can we use a server-based stage as part of a parallel job? I mean, can we use server stages in parallel jobs? I am sorry for giving you a hard time, but I am confused, since I thought parallel jobs can use only parallel stages and not server-based stages.

Thanks in advance
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Why are you confused? Server jobs use server stages, parallel jobs use parallel stages, sequence jobs use sequence stages. You even get a different Palette depending on what job type you have open.

(You can use server Shared Containers in parallel jobs under certain circumstances, but not server stage types directly.)
rsunny
Participant
Posts: 223
Joined: Sat Jul 03, 2010 10:22 pm

Post by rsunny »

jwiles wrote: Parallel jobs run only on the parallel engine, except for those instances where you can use a BASIC transformer or other server-based stage as part of a parallel job. Those particular stages will be run in the server engine (probably as a child process of a parallel skeleton operator) while the rest of the parallel job runs in the parallel engine. BASIC transformers cannot be compiled as a C++ operator.

Regards,
Hi Ray,
James mentioned using an "other server-based stage as part of a parallel job", so I was wondering how we can use a server-based stage as part of a parallel job.

Thanks in advance
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Server Shared Container.
-craig
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

(My fingers don't feel all that new today...gotta be the cold here in Chicago :) )

Be sure to read the documentation for the server shared container to understand how to use it properly. Also, it won't be a magic bullet for your server jobs as far as performance goes...they are still running in the server engine. Performance of a parallel job which uses one will likely be hindered somewhat as well.

Regards,
- james wiles
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Ah, "performance". Is it time for that argument again?

For small volumes of data a server job can be started, run and finished before a parallel job has completed its startup phase. This startup phase is allegedly improved (quicker) in version 8.5 but I've not yet had a chance to verify that one way or another.

So the real question remains: how should you define "performance" in an ETL context? I tend heavily to the view that it's about being able to meet time window KPIs with a margin for safety, that it's not properly measured as a rate at all.

Server components in parallel jobs will always slow them down, if only because of the need to transition between strongly-typed and typeless environments.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

I agree. Startup time is a killer for small files in the parallel engine (small being a loosely-typed term). 30+ seconds to start up a job that runs in <1 sec is a waste.
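To put rough numbers on that trade-off, here is a minimal sketch in Python. It is only a back-of-envelope model, not anything DataStage provides: it treats elapsed time as a fixed startup cost plus volume divided by throughput. The 30-second parallel startup comes from the figure above; the 1-second server startup and both throughput figures are made-up assumptions purely for illustration, so measure your own environment before drawing conclusions.

```python
def total_seconds(startup_s, volume_mb, throughput_mb_per_s):
    """Elapsed time = fixed startup cost + time spent moving the data."""
    return startup_s + volume_mb / throughput_mb_per_s


def break_even_mb(startup_a, rate_a, startup_b, rate_b):
    """Volume (MB) at which engine B catches up with engine A.

    A is the quick-starting engine, B the slow-starting but faster one.
    Returns None if B is not faster per MB (it never catches up) and
    0.0 if B also starts at least as quickly (it is never behind).
    """
    if rate_b <= rate_a:
        return None
    if startup_b <= startup_a:
        return 0.0
    # Solve startup_a + v / rate_a == startup_b + v / rate_b for v.
    return (startup_b - startup_a) / (1.0 / rate_a - 1.0 / rate_b)


# Hypothetical figures, NOT measurements: only the 30 s startup is from this thread.
SERVER_STARTUP_S, SERVER_MB_PER_S = 1.0, 5.0
PARALLEL_STARTUP_S, PARALLEL_MB_PER_S = 30.0, 50.0

if __name__ == "__main__":
    for volume_mb in (2, 100, 1000):
        server = total_seconds(SERVER_STARTUP_S, volume_mb, SERVER_MB_PER_S)
        parallel = total_seconds(PARALLEL_STARTUP_S, volume_mb, PARALLEL_MB_PER_S)
        print(f"{volume_mb} MB: server {server:.1f} s, parallel {parallel:.1f} s")
    print("break-even volume (MB):",
          break_even_mb(SERVER_STARTUP_S, SERVER_MB_PER_S,
                        PARALLEL_STARTUP_S, PARALLEL_MB_PER_S))
```

With whatever numbers you plug in, the point is the same: below the break-even volume the fixed startup cost dominates, and above it the per-MB rate does.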

Poorly chosen wording after a long day on-site...I'm allowed at least 10-12 of those I think. :)
- james wiles
gateleys
Premium Member
Posts: 992
Joined: Mon Aug 08, 2005 5:08 pm
Location: USA

Post by gateleys »

ray.wurlod wrote: For small volumes of data a server job can be started, run and finished before a parallel job has completed its startup phase.
Ray, what would you consider a "small" volume, versus a "medium" or "large" volume?

What about load time differences between server and parallel jobs, both in terms of conventional inserts and direct path loads?
gateleys
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Case by case basis. I guess anything under 1-2 MB (possibly larger) would fit in the small category. There's nothing to prevent a server job setting up a parallel load.