Unable Kill <defunct> OSH processes under dsadm (Zombi

Posted: Mon Nov 15, 2010 4:40 am
by amaresh
Hi,

Is there any problem with bringing the DataStage server down while zombie processes still exist?

Is there any way to get rid of those zombies without restarting the AIX server? And how can we find out why each process became a zombie?
--------------------------------------
I have brought the DataStage (v8.1) server down on AIX. When I checked afterwards, a few old processes, listed below, were still running. I am not able to kill them, even with kill -9.

Could you please let me know of any resolution to this?

dsadm 1884338 1 0 Jul 22 - 0:00 /opt/IBM/InformationServer/Server/PXEngine/bin/osh -APT_PMsectionLeaderFlag mjnraxd018 10002 1 30 node2 mjnraxd018 1279813249.813487.f90d6 0
dsadm 1958114 2359388 0 0:00 <defunct>
dsadm 1990838 1 0 Oct 27 - 0:00 /opt/IBM/InformationServer/Server/PXEngine/bin/osh -APT_PMsectionLeaderFlag mjnraxd018 10003 0 30 node1 mjnraxd018 1288170077.373548.1910ca 0
dsadm 2359388 1 0 Jul 22 - 0:00 /opt/IBM/InformationServer/Server/PXEngine/bin/osh -APT_PMsectionLeaderFlag mjnraxd018 10002 2 30 node3 mjnraxd018 1279813249.813487.f90d6 0


Thanks
Amaresh

Posted: Mon Nov 15, 2010 6:35 am
by ArndW
There are several ways a process can end up in a zombie or defunct state. They usually boil down to the process being stuck in an uninterruptible system call while awaiting a response (e.g. while communicating with a parent process). The UNIX "kill" signals are really just interrupt requests, and when a process is in this state all they do is queue up.

Sometimes you can look up the parent of a hanging process and kill the parent; this cascades and allows any hanging child processes to exit.
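
The parent lookup can be roughly sketched as follows, assuming the usual "ps -ef" column order (UID, PID, PPID, ...). A <defunct> entry has already exited and is only waiting for its parent to collect its exit status, so the PPID column is the interesting one. The function names are made up for illustration.

```shell
# Print "PID PPID" for every <defunct> entry in `ps -ef` style input.
# Assumed columns: UID PID PPID C STIME TTY TIME CMD
# Matching on the last field avoids picking up this awk command itself.
list_defunct() {
  awk '$NF == "<defunct>" { print $2, $3 }'
}

# For each defunct process, show the full ps listing of its parent.
show_defunct_parents() {
  ps -ef | list_defunct | while read -r pid ppid; do
    echo "defunct pid=$pid parent=$ppid"
    ps -f -p "$ppid"
  done
}
```

Running show_defunct_parents on the affected host would print each defunct PID followed by the ps line of its parent. In the listing from the first post, the only defunct entry (1958114) has parent 2359388, which is the osh section leader for node3.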

Re: Unable Kill <defunct> OSH processes under dsadm (Z

Posted: Mon Nov 15, 2010 6:41 am
by amaresh
Is there any way (any log) to see why a process is in the zombie state?
Can we get rid of those processes without rebooting the AIX server?
I tried to kill them using kill -9 while logged in as root, but that did not work.

Thanks
Amaresh

Posted: Mon Nov 15, 2010 6:50 am
by ArndW
Even kill -9 from root won't kill a process that isn't "listening" to signals. Are these processes waiting on a socket? Try "netstat -a | grep dsrpc" to check.
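
To make the result of that check easier to read, the netstat lines can be summarised by connection state. This is a minimal sketch that assumes the state is the last column of each line, as in typical netstat output. count_states is a hypothetical helper name, and which states would count as "stuck" (for example CLOSE_WAIT or FIN_WAIT_2) is an assumption to verify, not something netstat reports by itself.

```shell
# Count netstat lines per connection state (last column of each line).
# Blank lines are skipped; output is "STATE COUNT", sorted by state name.
count_states() {
  awk 'NF { n[$NF]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage on the server:
#   netstat -a | grep dsrpc | count_states
```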

Posted: Mon Nov 15, 2010 7:14 am
by amaresh
Hi,

Nothing is in a WAITING state, as shown below:

$ netstat -a | grep dsrpc
tcp4 0 0 *.dsrpc *.* LISTEN
tcp4 0 0 mjnraxd018.dsrpc 10.5.66.153.51870 ESTABLISHED
tcp4 0 0 mjnraxd018.dsrpc 10.5.66.153.53545 ESTABLISHED
tcp4 0 0 mjnraxd018.dsrpc 10.5.66.153.55379 ESTABLISHED
tcp4 0 0 mjnraxd018.dsrpc 10.5.66.153.57468 ESTABLISHED
$