Eskimo North



          Host Problems


          • To: outages-list@eskimo.com
          • Subject: Host Problems
          • From: Robert Dinse <nanook@eskimo.com>
          • Date: Tue, 16 Nov 1999 14:59:47 -0800 (PST)
          • Resent-Date: Tue, 16 Nov 1999 14:59:51 -0800
          • Resent-From: outages-list@eskimo.com
          • Resent-Message-ID: <"ECvB-3.0.4X1.c7UCu"@mx1>
          • Resent-Sender: outages-list-request@eskimo.com

          
               Problems this afternoon, starting at about noon, were all related to human
          error.
          
               Aaron was ill today; Catherine came in to cover for him.  I managed to get
          about an hour of sleep between 9am and 10am this morning, so I really was not
          and am not in a state to cover.
          
               She saw "tombstones" on the perfmeter for eskimo.  This is an indication
          that rpc.rstatd isn't responding, which may mean a dead machine but may also be
          just a load spike.
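
               As a rough sketch of how those two cases might be told apart before
          rebooting, the following assumes Python and some TCP service listening on the
          host (telnet on port 23 here).  The function name, host, and port are
          illustrative assumptions, not anything described in this report.  A completed
          connection only shows that the kernel is still accepting connections; it does
          not prove the machine is healthy.

```python
import socket


def host_answers(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout.

    A completed connection suggests the machine is alive (possibly just
    overloaded); a refusal or timeout suggests it may really be down.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, DNS failure, and timeout.
        return False


if __name__ == "__main__":
    # Hypothetical host and port, for illustration only.
    if host_answers("eskimo", 23, timeout=10.0):
        print("host still answers; likely a load spike, wait before rebooting")
    else:
        print("no answer; the host may be down")
```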
          
               She rebooted eskimo; eskimo failed to come up because it couldn't mount
          file systems from mx1.  She rebooted again, this time without a graceful halt.
          That resulted in file system corruption, which had to be fixed manually.
          
               After that was fixed, eskimo still would not mount file systems from mx1,
          which has the mail spool.  That happens to be the first NFS-mounted file
          system eskimo tries to mount, so it basically blocked all the others.
          
               Web servers and other machines are dependent upon eskimo for web files;
          mail is dependent upon mx1 to access the spool.  So this really broke a lot of
          things.
          
               Aaron had been working on trying to get knfsd operational on mx1.  We
          need it because it supports file system locking, which the regular nfsd
          doesn't, and the lack of locking causes problems with some mailers.
          
               He wasn't successful, but when he put the old daemons back, he put back
          some non-functional daemons instead of the ones that had been working.  The
          functional daemons appear to have been deleted.
          
               I had to restore these from tape, which is why it took so long to get
          that machine operational again.
          
               Everything appears to be back to normal.  Eric is here and I'm going to
          attempt to get some sleep.
          
          
