Server1 is an ETL resource node; Server2 is a database server.
The ID used is part of the dstage group.
Other jobs that do not access that particular database run in the same project under the same ID.
I get the following error when I run the configuration check.
##I TOCK 000000 11:56:49(001) <main_program> OS charset: ISO-8859-1.
##I TOCK 000000 11:56:49(002) <main_program> Input charset: UTF-8.
##I TFSC 000001 11:56:49(003) <main_program> APT configuration file: /path/default.apt
##I TFPA 000028 11:56:49(004) <main_program> APT Startup script: /path/startup.apt
##W TFPM 000152 11:57:22(000) <main_program> Accept timed out retries = 28
##E TFPM 000153 11:57:22(001) <main_program> The section leader on SERVER2 died
##E TFPM 000356 11:57:22(002) <main_program>
**** Parallel startup failed ****
This is usually due to a configuration error, such as
not having the Orchestrate install directory properly
mounted on all nodes, rsh permissions not correctly
set (via /etc/hosts.equiv or .rhosts), or running from
a directory that is not mounted on all nodes. Look for
error messages in the preceding output.
##I TFPM 000177 11:57:22(003) <main_program> Step started on node ENGINE_SERVER; it uses 7 nodes.
The program running the step is /path/orchadmin.
##I TFPM 000178 11:57:22(004) <main_program> The ORCHESTRATE startup program in /path/standalone.sh is being used.
##I TFPM 000180 11:57:22(005) <main_program> A startup script (in /path/startup.apt) is being used.
##I TFPM 000183 11:57:22(006) <main_program> The TCP port being used for startup is 10,002; the associated socket number is 5.
##I TFPM 000184 11:57:22(007) <main_program>
Node status:
##I TFPM 000185 11:59:06(012) <main_program> SERVER1 -
##I TFPM 000186 11:59:06(013) <main_program> OK
##I TFPM 000185 11:59:06(014) <main_program> SERVER2 -
##I TFPM 000187 11:59:06(015) <main_program> rsh issued, no response received
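
For longer logs with many nodes, it can help to pull out just the per-node results from the "Node status" part. The following is a minimal Python sketch, not anything shipped with the product. It assumes the layout seen in the output above: a node-name line whose message text ends in " -", followed by a log line whose message text is that node's status. The names `node_statuses` and `failed_nodes` are made up for illustration.

```python
import re

# One log line as it appears in the output above:
#   ##<severity> <module> <msgid> <time>(<seq>) <main_program> <text>
# This pattern is inferred from the excerpt, not from product documentation.
LINE_RE = re.compile(r"^##(\w)\s+(\w+)\s+(\d+)\s+\S+\s+<[^>]+>\s*(.*)$")


def node_statuses(log_text):
    """Return {node_name: status_text} from the node status lines.

    Heuristic (assumption): a message ending in " -" names a node, and
    the next parsed log line carries that node's status.
    """
    statuses = {}
    pending = None  # node name still waiting for its status line
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # continuation lines without the "##" prefix are skipped
        text = match.group(4).strip()
        if pending is not None:
            statuses[pending] = text
            pending = None
        elif text.endswith(" -"):
            pending = text[:-2].strip()
    return statuses


def failed_nodes(statuses):
    """Return the nodes whose status is anything other than OK."""
    return [node for node, status in statuses.items() if status != "OK"]
```

Fed the log above, this would report SERVER2 as the only node that did not answer, which matches the "section leader on SERVER2 died" error and points at the remote shell setup between the engine and SERVER2 for that ID.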