
Parallelism - DataStage Processing Nodes

Posted: Fri Oct 26, 2007 4:59 am
by devidotcom
Hi All,

I have a doubt about the logical processing nodes defined in the configuration file and would like to get it clarified.

Let's consider a machine with a single CPU and memory, and a config file with 2 processing nodes. If my job runs on the 2-node config file, there would be 2 processes running in parallel. How would the speed increase in that case compared to sequential mode? Or would the run time be the same as in sequential mode?

I have heard of config files with 64 processing nodes. Would that mean I would have 64/2 = 32 CPUs? Really!!!


How do we decide on the number of processing nodes each CPU can run?

These are some of the basic questions I have.
Please let me know if anyone has some understanding of this... :(

Posted: Fri Oct 26, 2007 5:18 am
by stefanfrost1
I think the general recommendation is 1 node per core, so if you have 4 dual-core CPUs you should run with an 8-node config. However, it all depends on your job design and hardware support...
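
As a quick illustration of the one-node-per-core rule of thumb, here is a minimal Python sketch. The function name `recommended_nodes` and its parameters are hypothetical, chosen only for this example, and the result is a starting point to tune against your job design rather than a fixed rule.

```python
def recommended_nodes(cpu_count: int, cores_per_cpu: int, nodes_per_core: int = 1) -> int:
    """Starting-point node count for a config file: one node per core by default.

    All names here are illustrative assumptions, not part of DataStage.
    """
    if cpu_count < 1 or cores_per_cpu < 1 or nodes_per_core < 1:
        raise ValueError("all arguments must be positive integers")
    return cpu_count * cores_per_cpu * nodes_per_core


# Four dual-core CPUs, as in the example in this post.
print(recommended_nodes(cpu_count=4, cores_per_cpu=2))  # prints 8
```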

Posted: Fri Oct 26, 2007 9:06 pm
by ray.wurlod
You monitor your jobs and determine what percentage of resources each one requires. Note that your assumption of "one process per job" is way off - it's more like one process per stage plus one process per node plus one. Based on the percentages you can figure out how many nodes are appropriate. It will probably be different for different jobs.
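
Reading that rule of thumb literally, the process count for a job could be estimated as in the sketch below. `estimate_processes` is a hypothetical helper written only for this illustration; the real number depends on the job design and on how the engine actually runs each stage, so treat it as a rough guide to check against what you see when monitoring.

```python
def estimate_processes(stage_count: int, node_count: int) -> int:
    """Literal reading of the rule of thumb: one per stage, one per node, plus one.

    This is an assumption-laden estimate, not a figure reported by DataStage.
    """
    if stage_count < 0 or node_count < 1:
        raise ValueError("stage_count must be >= 0 and node_count must be >= 1")
    return stage_count + node_count + 1


# Hypothetical job with 5 stages run on a 2-node config file.
print(estimate_processes(stage_count=5, node_count=2))  # prints 8
```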

Posted: Sun Oct 28, 2007 9:46 pm
by devidotcom
Hi Ray,

I am not a premium member, so I am unable to read your complete reply.

Thanks

Posted: Sun Oct 28, 2007 10:20 pm
by ArndW
devidotcom - so it seems that there are two basic choices: you going for the premium membership, or Ray posting the answer as non-premium. If I were a betting man, I'd know which one I'd be placing my bets on.

The premium cost is not excessive, and I have yet to see a single post in this forum where a premium member complained about the value he/she was getting for the fee.