Eskimo North



          Recent Problems with Eskinews...


          • To: outages-list@eskimo.com
          • Subject: Recent Problems with Eskinews...
          • From: Robert Dinse <nanook@eskimo.com>
          • Date: Sun, 26 Sep 1999 15:54:56 -0700 (PDT)
          • Resent-Date: Sun, 26 Sep 1999 15:57:08 -0700 (PDT)
          • Resent-From: outages-list@eskimo.com
          • Resent-Message-ID: <"3owXW3.0.wd1.1Jgxt"@mx2>
          • Resent-Sender: outages-list-request@eskimo.com

           
               There have been severe reliability problems with Eskinews recently and I
          thought I'd take a moment to share the details for those of you who are
          wondering what is going on. 
          
               We replaced the news server with newer hardware a while back after
          allegations from Sprint that our hardware was failing to respond causing the
          feed problems.  There was no, absolutely no evidence at this end that that was
          the case, but for the sake of getting past this issue, and also because the old
          hardware ran very warm and churned out tremendous BTUs, I opted to replace it.
          
               The new hardware would not boot correctly with old 2.0.x versions of Linux
          which we previously had on that machine.  The newer 2.2.x kernels, unfortunately,
          have some bugs on SMP machines that are causing instabilities.  We are working
          with various developers on this issue, and gradually those problems are being
          resolved. 
          
               The motherboard on the new machine has developed hardware problems causing
          ethernet errors, memory errors, and occasional SCSI errors.  The outfit we
          bought it from had some spares in stock but they tested those and found them to
          be defective, so they are still trying to chase down working replacement
          hardware for us.  So hardware problems are adding to the OS stability problems.
          
               Lastly, some hacker morons have discovered a buffer overflow exploit in
          innd.  On systems where the admins are insane enough to run inn as root, this
          can be exploited to gain root access.  (Nothing news-related here runs as
          root.)  But even on sites where this is not the case, the attempts still crash
          the inn daemon and corrupt some of its support files, in particular the
          'active' file.  When they do this it becomes necessary to hand-edit the active
          file to fix the problem.
          
               This is fixed in recent releases of INN, but with the include files and
          libs we presently have on the machine, the new INN will not compile.  It will
          probably be necessary to re-load this machine from scratch with the recent
          Red Hat 6.0 release to correct this.  However, before we get to that point I
          want to get the hardware problems resolved.
          
               After the machine is stable, we will add additional spool, and if a site
          survey works out, we'll get a satellite news feed to bypass the Sprint news
          problems.
          
               I have determined that part of the problem with the feed from Sprint is
          related to saturation of our T1's in the outbound direction.  Even though news
          is incoming, innd is very sensitive to latency and outbound saturation at times
          increases latency.  We are going to install additional bandwidth to remedy this
          issue, though the real motivation here isn't the news, as that will be moved to
          satellite anyway, but better interactive response and web response for people
          hosting web pages here.  We don't want to introduce any bottlenecks at this
          end. 
          
          
          
          
