Stopped/started DS, RPC daemon problem
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 1044
- Joined: Wed Sep 29, 2004 3:30 am
- Location: Nottingham, UK
- Contact:
I just stopped and started DataStage using these commands:
bin/uv -admin -stop
(wait 30 seconds)
bin/uv -admin -start
When starting any DS application I get this:
Failed to connect to host: corus, project: UV
(The connection was refused or the RPC daemon is not running (81016))
I've done this loads of times, most recently yesterday (I learned early on that waiting 30 seconds between them is a good idea) but this time, it's broken. Any suggestions? All eyes are on me expectantly...
(the reason I am doing this is that large SAP IDoc loads seem to be more reliable if this is done beforehand)
Phil Hibbs | Capgemini
Technical Consultant
Phil,
I'm not sure that 30 seconds is sufficient on its own. You should issue the following commands to ensure that nothing is returned prior to starting the services -
Code: Select all
netstat -a | grep rpc
ps -ef | grep phantom
ps -ef | grep -E 'dsapi|dsslave'
If these are clean then you should be able to start, but if any of them returns anything then you need to wait until it is clear. I have seen in the past that this can take minutes (3 - 5) before it is clean.
Sometimes a memory segment can cause problems and you can list DataStage memory segments with the following -
Code: Select all
lpcs -mop | grep dae
This will return a listing of memory segments where it may look like -
Code: Select all
0xdaec......
and remove them using the following -
Code: Select all
lpcrm -m [enter the ID from the above command]
I've had to use all of the above commands at one point or another to get the services started again. It's been a year or so since I was last on a Unix box, but I believe this will help you. Others may have more current information on naming conventions etc...
Regards,
Last edited by mhester on Wed Mar 30, 2005 8:25 am, edited 1 time in total.
Mike Hester
mhester@petra-ps.com
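One practical wrinkle with "make sure nothing is returned": a plain ps -ef | grep phantom usually lists the grep command itself, so the output is rarely empty. A minimal sketch of a filter that ignores those lines follows; the function name ds_leftover_procs is made up for illustration, and the process names are simply the ones mentioned in this thread.

```shell
# Hypothetical helper: print `ps -ef` lines that look like leftover DataStage
# processes (phantom, dsapi, dsslave), skipping the grep commands themselves.
# Reads ps output on stdin; exits non-zero when nothing is left.
ds_leftover_procs() {
    grep -E 'phantom|dsapi|dsslave' | grep -v grep
}

# Typical use on the server (not executed here):
#   ps -ef | ds_leftover_procs || echo "no DataStage processes left"
```

The exit status comes from the last grep, so the function can be used directly in an if or while test.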
Hi,
This was already covered in previous posts; read the manuals for more insight.
If there are active connections, dsrpcd will not restart until they are removed.
Sometimes there is a 5-10 minute timeout for client connections (check using netstat | grep dsrpc).
A good practice is to bring the DS service down only after making sure no one is connected.
If you have already brought the dsrpcd service down via uv -admin -stop, make sure there are no connections left; if there are any, have them terminated. Only then will the dsrpcd service come up when you use uv -admin -start.
IHTH,
Last edited by roy on Wed Mar 30, 2005 8:22 am, edited 2 times in total.
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Code: Select all
netstat -a | grep rpc
That command is currently returning:
Code: Select all
tcp4 0 0 *.sunrpc *.* LISTEN
udp4 0 0 *.sunrpc *.*
Are these DataStage related?
Last edited by PhilHibbs on Wed Mar 30, 2005 10:52 am, edited 1 time in total.
Phil Hibbs | Capgemini
Chances are that by the time you're rechecking, all is well, so try uv -admin -start again; you should be fine.
It won't hurt to check for zombie phantom jobs in case you had any abnormal terminations.
Oops, forgot: no, they are not relevant; you need to check for *.dsrpc entries.
Roy R.
mhester wrote:
Code: Select all
lpcs -mop | grep dae
I don't have an lpcs or lpcrm command.
Do you mean ipcs and ipcrm? (thanks to Google for that suggestion)
Phil Hibbs | Capgemini
IMHO he meant that, but having a sysadmin on hand to perform tasks like this is advice worth taking.
Roy R.
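For anyone following along with ipcs and ipcrm, here is a rough sketch of pulling out the IDs of segments whose key starts with 0xdae, the prefix mentioned earlier in the thread. The function name ds_segment_ids is made up, and the column positions are an assumption about two common ipcs -m layouts (key first on Linux; type, ID, key on AIX/Solaris), so compare against the header your platform prints before trusting it.

```shell
# Hypothetical helper: print the IDs of shared memory segments whose key
# starts with 0xdae. Reads `ipcs -m` style output on stdin.
# Assumed layouts (verify on your own system):
#   Linux:        key in column 1, ID in column 2
#   AIX/Solaris:  "m" in column 1, ID in column 2, key in column 3
ds_segment_ids() {
    awk '$1 ~ /^0x[dD][aA][eE]/ || $3 ~ /^0x[dD][aA][eE]/ { print $2 }'
}

# Typical use on the server (not executed here), only with DataStage stopped:
#   ipcs -m | ds_segment_ids
#   ipcrm -m <one of the IDs printed above>
```

Listing first and removing by hand keeps a person (ideally the sysadmin) in the loop before anything is deleted.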
PhilHibbs wrote:
Code: Select all
netstat -a | grep rpc
That command is currently returning:
Code: Select all
tcp4 0 0 *.sunrpc *.* LISTEN
udp4 0 0 *.sunrpc *.*
Are these DataStage related?
No, those are not the DataStage ones. If there were any, they would look like: earth.34902 earth.dsrpc 32768 0 32768 0 FIN_WAIT_2
Best bet is to kill off all DataStage processes on your Unix server before restarting. Also make sure all client sessions are logged off.
Nick
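To make the sunrpc/dsrpc distinction concrete, here is a small sketch of a filter that keeps only netstat lines naming the dsrpc service. The function name dsrpc_connections is made up, and it assumes netstat prints the service by name; with numeric output (netstat -an) you would have to match the port number instead.

```shell
# Hypothetical helper: print netstat lines whose address ends in the dsrpc
# service name, written either host.dsrpc (AIX/Solaris style) or host:dsrpc
# (Linux style). Lines for sunrpc are not matched.
# Reads `netstat -a` output on stdin; exits non-zero when there are none.
dsrpc_connections() {
    grep -E '[.:]dsrpc([[:space:]]|$)'
}

# Typical use on the server (not executed here):
#   netstat -a | dsrpc_connections || echo "no dsrpc connections left"
```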
Check ps before starting uv.
It would be a good idea to add check logic for DS processes rather than just waiting for 30 seconds.
bin/uv -admin -stop
(wait 30 seconds)
bin/uv -admin -start
The above script becomes something like this:
bin/uv -admin -stop
# keep checking until no DataStage processes are left
while ps -ef | grep -E 'phantom|dsapi|dsslave' | grep -v grep > temp.ps
do
    sleep 180    # processes still listed in temp.ps; wait a few minutes and check again
done
bin/uv -admin -start
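If the leftover processes never go away, a wait-and-recheck approach would hang forever, so it may be worth capping the number of attempts. The following is only a sketch: wait_until_clean and check_ds are made-up names, and the check command is assumed to behave like grep (exit status 0 while something is still found).

```shell
# Hypothetical helper: run a check command until it finds nothing or the
# attempts run out.
# Usage: wait_until_clean MAX_TRIES SLEEP_SECONDS CHECK_COMMAND [ARGS...]
# Returns 0 once the check exits non-zero (nothing found), 1 on giving up.
# Note: plain sh has no local variables, so "tries" and "pause" are global.
wait_until_clean() {
    tries=$1; pause=$2; shift 2
    while [ "$tries" -gt 0 ]; do
        if ! "$@" > /dev/null 2>&1; then
            return 0                 # check found nothing: safe to start
        fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$pause"
    done
    return 1                         # still not clean after all attempts
}

# Typical use on the server (not executed here):
#   check_ds() { ps -ef | grep -E 'phantom|dsapi|dsslave' | grep -v grep; }
#   bin/uv -admin -stop
#   wait_until_clean 10 30 check_ds && bin/uv -admin -start
```

With 10 tries and a 30 second pause this gives up after roughly five minutes, at which point someone can look at what is still running instead of the script starting the services regardless.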
Re: Stopped/started DS, RPC daemon problem
Thanks for all the suggestions.
It now turns out that stopping and starting datastage is not a panacea for large IDoc loads, as I just did that and it went more badly wrong than it has ever done before. Oh well.
Phil Hibbs | Capgemini
Hi,
One flaw in goma's method!
You should make sure no one is connected before performing uv -admin -stop.
Roy R.