Eskimo North



          Virtual Domain Problems


          • To: outages-list@eskimo.com
          • Subject: Virtual Domain Problems
          • From: Robert Dinse <nanook@eskimo.com>
          • Date: Wed, 17 Nov 1999 09:23:54 -0800 (PST)
          • cc: earthcng@eskimo.com
          • Resent-Date: Wed, 17 Nov 1999 10:03:32 -0800 (PST)
          • Resent-From: outages-list@eskimo.com
          • Resent-Message-ID: <"fuKcv3.0.vd3.ftkCu"@mx2>
          • Resent-Sender: outages-list-request@eskimo.com

          
     I had a call at 7am regarding virtual domains not responding.  I found
that the server was at 100% CPU, with a bunch of stuck 'swish' processes running.
          
               One of the consequences of allowing users to put their own cgi scripts
          on-line without review is that occasionally they will do stupid things that
          exhaust system resources.
          
     In the case of swish, users who run close to their quota limits may
schedule cron jobs to build the swish database nightly.  If they exceed their
quota during the swish database build, the result is a truncated swish database
file.
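     Such a nightly rebuild would typically be a crontab entry along these
lines (the swish invocation, flags, and paths here are illustrative guesses,
not any real user's job):

```shell
# Hypothetical crontab entry: rebuild the swish index at 02:30 each night.
# If the account hits its disk quota mid-run, the index file is left truncated.
30 2 * * * swish -c $HOME/swish.conf -f $HOME/web/index.swish
```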
          
     When this happens, if someone runs a search, the swish process that is
invoked attempts to seek past the end of the file, which fails.  It retries
indefinitely until manually killed.  If the user grows impatient and keeps hitting
the search button, these processes stack up and eventually bog the server down
into non-functionality.
          
     To prevent further occurrences of this particular problem I have added the
following to suexec, the program used to launch CGI scripts:
          
     /*
      * Place limit on CPU consumption.
      * (setrlimit() and struct rlimit come from <sys/resource.h>.)
      */
     struct rlimit rl;

     rl.rlim_cur = 30;
     rl.rlim_max = 30;
     setrlimit(RLIMIT_CPU, &rl);
          
     What this does is limit any CGI script to 30 seconds of CPU time.  So,
for example, when someone runs a swish search against a broken database file,
instead of looping indefinitely the process aborts after 30 seconds, preventing
processes from stacking up and killing the server.
          
     On rare occasions I've also seen Apache itself get stuck, so I've added
similar code to Apache; in httpd_main, in the function child_sub_main, I've added:
          
     /*
      * Set limit on CPU utilization by child process
      */
     struct rlimit rl;

     rl.rlim_cur = 60;
     rl.rlim_max = 60;
     setrlimit(RLIMIT_CPU, &rl);
          
     This limits children to 60 seconds of CPU time which, given the setting
of MaxRequestsPerChild, should not be exceeded under normal circumstances, but
will prevent a runaway process from bogging down the server for long periods
of time.
          
     If a child should exceed this threshold, it will die and that request will
be aborted, but the parent process will respawn additional child processes as
required to service requests.
          
          
          
          
