Unix - Frequently Asked Questions (3/7) [Frequent posting]
Section - How do I get rid of zombie processes that persevere?

From: casper@fwi.uva.nl (Casper Dik)
Date: Thu, 09 Sep 93 16:39:58 +0200

3.13) How do I get rid of zombie processes that persevere?

      Unfortunately, it's impossible to generalize how the death of
      child processes should behave, because the exact mechanism varies
      over the various flavors of Unix.

      First of all, by default, you have to do a wait() for child
      processes under ALL flavors of Unix.  That is, there is no flavor
      of Unix that I know of that will automatically flush child
      processes that exit unless you do something to tell it to do
      so.

      Second, under some SysV-derived systems, if you do
      "signal(SIGCHLD, SIG_IGN)" (well, actually, it may be SIGCLD
      instead of SIGCHLD, but most of the newer SysV systems have
      "#define SIGCHLD SIGCLD" in the header files), then child
      processes will be cleaned up automatically, with no further
      effort on your part.  The best way to find out if it works at
      your site is to try it, although if you are trying to write
      portable code, it's a bad idea to rely on this in any case.
      Unfortunately, POSIX doesn't allow you to do this; the behavior
      of setting SIGCHLD to SIG_IGN under POSIX is undefined, so
      you can't do it if your program is supposed to be
      POSIX-compliant.
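
      One way to "try it" is a small probe like the sketch below.  It
      is only an illustration: the function name sigchld_ign_reaps()
      is made up for this example, and it assumes a system that has
      waitpid().  It ignores SIGCHLD, forks a child that exits at once,
      and then looks at what waitpid() reports.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical probe (not part of any library).
 * Returns 1 if children are cleaned up automatically while SIGCHLD is
 * ignored, 0 if the child was left for us to collect, -1 on error.
 */
static int sigchld_ign_reaps(void)
{
    void (*old_handler)(int);
    pid_t pid, got;
    int result;

    old_handler = signal(SIGCHLD, SIG_IGN);
    if (old_handler == SIG_ERR)
        return -1;

    pid = fork();
    if (pid < 0) {
        signal(SIGCHLD, old_handler);
        return -1;
    }
    if (pid == 0)
        _exit(0);               /* child: terminate immediately */

    /* Parent: ask for the child's status. */
    got = waitpid(pid, NULL, 0);
    if (got == pid)
        result = 0;             /* we had to collect it ourselves */
    else if (got == -1 && errno == ECHILD)
        result = 1;             /* the system already discarded it */
    else
        result = -1;

    signal(SIGCHLD, old_handler);   /* put the old disposition back */
    return result;
}
```

      A result of 1 only tells you what this one system does; it says
      nothing about portability.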

      So, what's the POSIX way? As mentioned earlier, you must
      install a signal handler and wait.  Under POSIX, signal handlers
      are installed with sigaction().  Since you are not interested in
      ``stopped'' children, only in terminated children, add SA_NOCLDSTOP
      to sa_flags.  Waiting without blocking is done with waitpid().
      The first argument to waitpid should be -1 (wait for any pid),
      the third should be WNOHANG. This is the most portable way
      and is likely to become more portable in future.

      If your system doesn't support POSIX, there are a number of ways.
      The easiest way is signal(SIGCHLD, SIG_IGN), if it works.
      If SIG_IGN cannot be used to force automatic clean-up, then you've
      got to write a signal handler to do it.  It isn't easy at all to
      write a signal handler that does things right on all flavors of
      Unix, because of the following inconsistencies:

      On some flavors of Unix, the SIGCHLD signal handler is called if
      one *or more* children have died.  This means that if your signal
      handler only does one wait() call, then it won't clean up all of
      the children.  Fortunately, I believe that all Unix flavors for
      which this is the case have available to the programmer the
      wait3() or waitpid() call, which allows the WNOHANG option to
      check whether or not there are any children waiting to be cleaned
      up.  Therefore, on any system that has wait3()/waitpid(), your
      signal handler should call wait3()/waitpid() over and over again
      with the WNOHANG option until there are no children left to clean
      up.  The preferred interface is waitpid(), as it is in POSIX.

      On SysV-derived systems, SIGCHLD signals are regenerated if there
      are child processes still waiting to be cleaned up after you exit
      the SIGCHLD signal handler.  Therefore, it's safe on most SysV
      systems to assume when the signal handler gets called that you
      only have to clean up one child, and assume that the handler
      will get called again if there are more to clean up after it
      exits.

      On older systems, there is no way to prevent signal handlers
      from being automatically reset to SIG_DFL when the signal
      handler gets called.  On such systems, you have to put
      "signal(SIGCHLD, catcher_func)" (where "catcher_func" is the
      name of the handler function) as the last thing in the signal
      handler, so that it gets reset.

      Fortunately, newer implementations allow signal handlers to be
      installed without being reset to SIG_DFL when the handler
      function is called; signal handlers that stick can be installed
      with sigaction() or sigset().  For backward compatibility
      reasons, System V will keep the old semantics (reset handler on
      call) of signal().  To get around the reset problem on systems
      that do not have wait3()/waitpid() but do have SIGCLD, you need
      to reset the signal handler with a call to signal() after doing
      at least one wait() within the handler, each time it is called.
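
      For systems with only wait() and the old signal() semantics, a
      handler might look like the sketch below.  It assumes that
      SIGCHLD is defined (use SIGCLD where it is not); the counter
      caught_children and the function install_catcher() are made up
      for this example.  The point is the order: wait() first, then
      signal().

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical counter, for illustration only. */
static volatile sig_atomic_t caught_children = 0;

/* Old-style handler: one wait() per invocation, then re-install. */
static void catcher_func(int signo)
{
    int status;

    (void)signo;

    /* Collect one child BEFORE re-installing the handler. */
    if (wait(&status) > 0)
        caught_children++;

    /* Last thing in the handler: install it again, in case the
       system reset it to SIG_DFL when the handler was called. */
    signal(SIGCHLD, catcher_func);
}

/* Install the handler; returns 0 on success, -1 on error. */
static int install_catcher(void)
{
    return signal(SIGCHLD, catcher_func) == SIG_ERR ? -1 : 0;
}
```

      On systems where the handler is not reset, the extra signal()
      call is harmless.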

      The summary of all this is that on systems that have waitpid()
      (POSIX) or wait3(), you should use that and your signal handler
      should loop, and on systems that don't, you should have one call
      to wait() per invocation of the signal handler.

      One more thing -- if you don't want to go through all of this
      trouble, there is a portable way to avoid this problem, although
      it is somewhat less efficient.  Your parent process should fork,
      and then wait right there and then for the child process to
      terminate.  The child process then forks again, giving you a
      child and a grandchild.  The child exits immediately (and hence
      the parent waiting for it notices its death and continues to
      work), and the grandchild does whatever the child was originally
      supposed to.  Since its parent died, it is inherited by init,
      which will do whatever waiting is needed.  This method is
      inefficient because it requires an extra fork, but is pretty much
      completely portable.
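
      The double fork can be sketched as follows.  The names
      spawn_detached() and example_job() are invented for this
      example, and waitpid() is used for the parent's wait; on a
      system without it, a plain wait() would do the same job here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for the real work the child was supposed to do. */
static void example_job(void)
{
    static const char msg[] = "grandchild running\n";

    (void)write(STDOUT_FILENO, msg, sizeof msg - 1);
}

/*
 * Hypothetical helper: run job() in a grandchild that the caller
 * never has to wait for.  Returns 0 on success, -1 on failure.
 */
static int spawn_detached(void (*job)(void))
{
    int status;
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        /* First child: create the grandchild, then exit at once. */
        pid_t grandchild = fork();

        if (grandchild < 0)
            _exit(1);
        if (grandchild == 0) {
            job();              /* grandchild does the real work */
            _exit(0);
        }
        _exit(0);               /* orphans the grandchild */
    }

    /* Parent: the first child exits right away, so this is short. */
    if (waitpid(child, &status, 0) < 0)
        return -1;

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

      A call such as spawn_detached(example_job) returns as soon as
      the first child has exited; the caller is left with no children
      of its own to wait for.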
