rsh issued, no response received
Posted: Thu Apr 07, 2011 1:59 pm
I have already set up SSH for Server1 and Server2 from the Engine node. I am able to SSH to the servers manually, but I cannot run the jobs from DataStage; they fail with a "section leader died" error.
Server1 is an ETL resource node; Server2 is a database server.
The ID used is part of the dstage group.
Other jobs in the project that do not access that particular database run fine with the same ID.
I get the following error when I run the configuration check.
Kindly help me out.
##I TOCK 000000 11:56:49(001) <main_program> OS charset: ISO-8859-1.
##I TOCK 000000 11:56:49(002) <main_program> Input charset: UTF-8.
##I TFSC 000001 11:56:49(003) <main_program> APT configuration file: /path/default.apt
##I TFPA 000028 11:56:49(004) <main_program> APT Startup script: /path/startup.apt
##W TFPM 000152 11:57:22(000) <main_program> Accept timed out retries = 28
##E TFPM 000153 11:57:22(001) <main_program> The section leader on SERVER2 died
##E TFPM 000356 11:57:22(002) <main_program>
**** Parallel startup failed ****
This is usually due to a configuration error, such as
not having the Orchestrate install directory properly
mounted on all nodes, rsh permissions not correctly
set (via /etc/hosts.equiv or .rhosts), or running from
a directory that is not mounted on all nodes. Look for
error messages in the preceding output.
##I TFPM 000177 11:57:22(003) <main_program> Step started on node ENGINE_SERVER; it uses 7 nodes.
The program running the step is /path/orchadmin.
##I TFPM 000178 11:57:22(004) <main_program> The ORCHESTRATE startup program in /path/standalone.sh is being used.
##I TFPM 000180 11:57:22(005) <main_program> A startup script (in /path/startup.apt) is being used.
##I TFPM 000183 11:57:22(006) <main_program> The TCP port being used for startup is 10,002; the associated socket number is 5.
##I TFPM 000184 11:57:22(007) <main_program>
Node status:
##I TFPM 000185 11:59:06(012) <main_program> SERVER1 -
##I TFPM 000186 11:59:06(013) <main_program> OK
##I TFPM 000185 11:59:06(014) <main_program> SERVER2 -
##I TFPM 000187 11:59:06(015) <main_program> rsh issued, no response received
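
When the configuration check covers many nodes, the "Node status" section above can be pulled out with a short script to list only the nodes that did not come back OK. This is just a sketch in Python: the function name `node_status` and the regular expression for the log prefix are my own assumptions based on the lines shown in this post, not anything supplied by DataStage.

```python
import re

# Assumed shape of the log prefix, taken from the lines in this post, e.g.
# "##I TFPM 000185 11:59:06(012) <main_program> "
PREFIX = re.compile(r"^##[IWE] \w+ \d+ \d{2}:\d{2}:\d{2}\(\d+\) <[^>]+> ?")


def node_status(log_text):
    """Return {node: status} from the 'Node status:' part of the output."""
    statuses = {}
    pending = None      # node name waiting for its status line
    in_status = False   # becomes True once "Node status:" has been seen
    for raw in log_text.splitlines():
        line = PREFIX.sub("", raw).strip()
        if line == "Node status:":
            in_status = True
            continue
        if not in_status or not line:
            continue
        if pending is None:
            # Node lines look like "SERVER1 -"
            if line.endswith(" -"):
                pending = line[:-2].strip()
        else:
            # The line after a node line carries its status
            statuses[pending] = line
            pending = None
    return statuses


# The node status lines copied from the output in this post
sample = """\
##I TFPM 000184 11:57:22(007) <main_program>
Node status:
##I TFPM 000185 11:59:06(012) <main_program> SERVER1 -
##I TFPM 000186 11:59:06(013) <main_program> OK
##I TFPM 000185 11:59:06(014) <main_program> SERVER2 -
##I TFPM 000187 11:59:06(015) <main_program> rsh issued, no response received
"""

failed = {node: s for node, s in node_status(sample).items() if s != "OK"}
print(failed)  # {'SERVER2': 'rsh issued, no response received'}
```

For this run it reports only SERVER2, which matches the "section leader on SERVER2 died" error earlier in the output.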