Fatal Error: Unable to allocate communication resources

JPalatianos
Premium Member
Posts: 306
Joined: Wed Jun 21, 2006 11:41 am

Post by JPalatianos »

Hi,
We have a job that starts with a sequential file and then does a lookup (via the lookup stage) on an ODBC stage. When we run the job we get the following error on the lookup stage.
1st warning:
main_program: Operator "parallel APT_KeyGenerator in skNewPlan" is not wave aware; the operator will be reset and rerun on each wave if multiple waves present. This may lead to incorrect results and memory issues. Update the operator to make it wave aware and calls setWaveAware() in describeOperator() to inform the framework that the operator knows how to handle waves.
2nd warning:
skNewPlan: When checking operator: When binding output interface field "METRIX_PLAN_KEY" to field "METRIX_PLAN_KEY": Implicit conversion from source type "uint64" to result type "int32": Possible range limitation.

3rd warning:
Sequential_File_11: When checking operator: A sequential operator cannot preserve the partitioning of the parallel data set on input port 0.

Fatal error 1:
lkup_existingPlans,0: Fatal Error: Unable to allocate communication resources.

Fatal error 2:
node_node0: Player 4 terminated unexpectedly.
Player 2 terminated unexpectedly.

Fatal error 3:
main_program: APT_PMsectionLeader(1, node0), player 2 - Unexpected exit status 1.

last fatal error:
main_program: Step execution finished with status = FAILED.

Thanks - - John
JPalatianos
Premium Member
Posts: 306
Joined: Wed Jun 21, 2006 11:41 am

Post by JPalatianos »

Just wanted to add that our developers have only recently started developing parallel jobs in our dev environment, and I am wondering whether this is a configuration issue on my part. Being new to the Enterprise Edition, I may have missed something during the install/setup/configuration of DataStage.
Thanks - - John
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Are you doing wave processing (intentionally)? What stage is setting up end-of-waves? Are you sure that you've seen all the error messages? If, for instance, a disk fills up you will also get the "Unable to allocate communication resources." message, but that is not the cause of the problem, just an effect.
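
To rule out the full-disk case described above, a quick free-space check on the directories the engine writes to can help. The following is only a minimal Python sketch: the helper name `low_space_dirs`, the directory list, and the 5% threshold are assumptions for illustration, not DataStage settings. Substitute the scratch and resource disk paths from your own parallel configuration file.

```python
import shutil
import tempfile


def low_space_dirs(paths, min_free_fraction=0.05):
    """Return (path, free_fraction) for each path whose free space is below the threshold."""
    flagged = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
        if usage.total == 0:
            continue  # skip pseudo-filesystems that report no capacity
        free_fraction = usage.free / usage.total
        if free_fraction < min_free_fraction:
            flagged.append((path, free_fraction))
    return flagged


if __name__ == "__main__":
    # Hypothetical list: replace with your own scratch/resource disk paths.
    candidates = [tempfile.gettempdir()]
    for path, fraction in low_space_dirs(candidates):
        print(f"{path}: only {fraction:.1%} free")
```

If nothing is printed, none of the listed directories is below the threshold at the moment of the check; a disk that fills up only while the job runs would still need to be watched during execution.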
JPalatianos
Premium Member
Posts: 306
Joined: Wed Jun 21, 2006 11:41 am

Post by JPalatianos »

Definitely not doing wave processing intentionally, since neither I nor the developer knows what that is. Sorry for the ignorance... but we are new to the world of parallel processing.

Regarding the errors, these are all I see in Director. There are other messages, but all are informational.
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

The "not wave aware" warning is common when you use an ODBC Connector stage as a source and a Surrogate Key Generator stage downstream. It should not cause this error.

You should have some other information in the logs. Since the error is coming from a Lookup stage (I believe), check for another log entry that says "Lookup failed" or something similar.

Also check whether a message handler is attached.
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
JPalatianos
Premium Member
Posts: 306
Joined: Wed Jun 21, 2006 11:41 am

Post by JPalatianos »

The only message I see from the Lookup stage is this:
Fatal error 1:
lkup_existingPlans,0: Fatal Error: Unable to allocate communication resources.
JRodriguez
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City

Post by JRodriguez »

JPalatianos,

Did you implement these suggested steps in your Windows PX processing environment?

http://publib.boulder.ibm.com/infocente ... e_win.html

If yes, then you can try a few steps to fix the issue, listed in order of how easy they are:

Add the environment variable APT_DISABLE_COMBINATION=FALSE

or increase the DSIPC_OPEN_TIMEOUT value
or add the following 2 environment variables to the job, compile and re-run it.

APT_PM_CONDUCTOR_TIMEOUT=60
APT_PM_NODE_TIMEOUT=60

I have given you the initial settings for these environment variables. You can increase them in 30-second increments until the job runs without error.
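
The 30-second stepping described above can be written down as a short schedule. This is only a sketch in Python: the helper name `timeout_schedule` and the 300-second ceiling are assumptions for illustration, while the variable names and the starting value of 60 come from this post. It only prints the candidate settings; applying them is still done through the job's environment variables.

```python
def timeout_schedule(start=60, step=30, ceiling=300):
    """Yield candidate values for both timeout variables, from start up to ceiling."""
    for seconds in range(start, ceiling + 1, step):
        yield {
            "APT_PM_CONDUCTOR_TIMEOUT": str(seconds),
            "APT_PM_NODE_TIMEOUT": str(seconds),
        }


if __name__ == "__main__":
    # Print one line per attempt so the values can be tried in order.
    for attempt, settings in enumerate(timeout_schedule(), start=1):
        pairs = ", ".join(f"{name}={value}" for name, value in settings.items())
        print(f"attempt {attempt}: {pairs}")
```

Both variables are raised together here purely for simplicity; nothing in the post requires them to stay equal.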


Finally, check the OS user/group permissions on the WINDOWS\TEMP directory and make sure the Users group is set to 'All Access'.
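
One way to confirm the permission point is to try creating a file in the temp directory while logged in as the same OS user that runs the job. The Python sketch below does that; the helper name `can_write` and the use of the TEMP environment variable to locate the directory are assumptions for illustration, so point it at the actual directory if yours differs.

```python
import os
import tempfile


def can_write(directory):
    """Return True if the current user can create and remove a file in directory."""
    try:
        # The file is created on entry and deleted automatically on exit.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed location: the TEMP variable if set, else Python's default temp dir.
    target = os.environ.get("TEMP", tempfile.gettempdir())
    status = "writable" if can_write(target) else "NOT writable"
    print(f"{target}: {status}")
```

The result only reflects the account that runs the script, so running it under a different login than the job's would not tell you much.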
Last edited by JRodriguez on Thu Jun 11, 2009 11:32 am, edited 2 times in total.
Julio Rodriguez
ETL Developer by choice

"Sure we have lots of reasons for being rude - But no excuses"
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

JRodriguez wrote: Add the environment variable APT_DISABLE_COMBINATION=FALSE
Guessing you actually meant True here.
-craig

"You can never have too many knives" -- Logan Nine Fingers
JRodriguez
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City

Post by JRodriguez »

Nope, this should be set to "FALSE" to use less time between processes ...
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

OK... but isn't that the default? I thought the suggestion was to disable it so perhaps the problem stage/area would be more obvious in the score or logs but I guess not.

Carry on.
JRodriguez
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City

Post by JRodriguez »

Chulett,

Normally you disable operator combination to facilitate debugging, as you stated. In this case we want the job to use fewer resources.

When you set the variable to TRUE, more processes are generated and the job will use more system resources, which can cause this error.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Got that... but still, I don't believe adding that will change anything as the option of "operator combinality" is on and happening by default regardless. You specifically add the one you mentioned in order to set 'disable' to 'true' and thus stop it from happening.

That was my only point in the last post.
JPalatianos
Premium Member
Posts: 306
Joined: Wed Jun 21, 2006 11:41 am

Post by JPalatianos »

Thanks guys!!!
I added the variable APT_DISABLE_COMBINATION and set it to TRUE. All warnings and errors are now gone.
JRodriguez
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City

Post by JRodriguez »

Great!

If your warnings and errors are gone after setting the environment variable to TRUE, it means your timeout settings were not high enough.

Leaving the environment variable set to TRUE disables internal job optimizations and will cause other issues in the future ...

I would suggest fixing the root of this issue ... and then setting the environment variable back to FALSE.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As noted, it's not really a solution, per se... still best to see if you can track down the root cause.