Recent Problems with Eskinews...
- To: outages-list@eskimo.com
- Subject: Recent Problems with Eskinews...
- From: Robert Dinse <nanook@eskimo.com>
- Date: Sun, 26 Sep 1999 15:54:56 -0700 (PDT)
- Resent-Date: Sun, 26 Sep 1999 15:57:08 -0700 (PDT)
- Resent-From: outages-list@eskimo.com
- Resent-Message-ID: <"3owXW3.0.wd1.1Jgxt"@mx2>
- Resent-Sender: outages-list-request@eskimo.com
There have been severe reliability problems with Eskinews recently and I
thought I'd take a moment to share the details for those of you who are
wondering what is going on.
We replaced the news server with newer hardware a while back after
allegations from Sprint that our hardware was failing to respond, causing the
feed problems. There was no, absolutely no, evidence at this end that that was
the case, but for the sake of getting past this issue, and also because the old
hardware ran very warm and churned out tremendous BTUs, I opted to replace it.
The new hardware would not boot correctly with the old 2.0.x versions of Linux
that we previously ran on that machine. The newer 2.2.x kernels, unfortunately,
have some bugs on SMP machines that are causing instability. We are working
with various developers on this issue, and gradually those problems are being
resolved.
The motherboard on the new machine has developed hardware problems causing
Ethernet errors, memory errors, and occasional SCSI errors. The outfit we
bought it from had some spares in stock, but they tested those and found them
to be defective, so they are still trying to chase down working replacement
hardware for us. So hardware problems are adding to the OS stability problems.
Lastly, some hacker morons have discovered a buffer overflow exploit in
innd. On systems where the admins are insane enough to run INN as root, this
can be exploited to gain root access. (Nothing news-related here runs as
root.) But even on sites where this is not the case, the attempts still crash
the INN daemon and corrupt some of its support files, in particular the
'active' file. When they do this it becomes necessary to hand-edit the active
file to fix the problem.
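As a rough illustration of what hand-checking the active file involves, here
is a minimal Python sketch that flags lines not matching the usual active file
layout of "group himark lomark flag". The group names and article numbers in
the sample are made up, and the assumption that corruption shows up as
malformed lines is only a guess; it is not a description of the procedure
actually used here.

```python
import re

# One line of an INN active file: "<group> <himark> <lomark> <flag>".
# The flag is y, n, m, j, x, or "=other.group" for an aliased group.
ACTIVE_LINE = re.compile(r"^(\S+) (\d+) (\d+) ([ynmjx]|=\S+)$")


def find_bad_lines(lines):
    """Return (line_number, text) pairs for lines that do not parse."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not ACTIVE_LINE.match(text):
            bad.append((number, text))
    return bad


# Hypothetical sample data; the second line simulates a damaged entry.
sample = [
    "comp.os.linux.misc 0000012345 0000012000 y\n",
    "alt.test 0000000100 00000\x00\x00garbage\n",
    "news.announce.newusers 0000000042 0000000040 m\n",
]

for number, text in find_bad_lines(sample):
    print(number, repr(text))
```

A check like this only locates suspect lines; deciding what the correct
entries should be still has to be done by hand.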
This is fixed in recent releases of INN, but with the include files and
libs we presently have on the machine, the new INN will not compile. It will
probably be necessary to reload this machine from scratch with the recent
Red Hat 6.0 release to correct this. Before we get to that point, however, I
want to get the hardware problems resolved.
After the machine is stable, we will add additional spool, and if a site
survey works out, we'll get a satellite news feed to bypass the Sprint news
problems.
I have determined that part of the problem with the feed from Sprint is
related to saturation of our T1s in the outbound direction. Even though news
is incoming, innd is very sensitive to latency, and outbound saturation at
times increases latency. We are going to install additional bandwidth to
remedy this issue. The real motivation here isn't the news, as that will be
moved to satellite anyway, but better interactive response and web response
for people hosting web pages here. We don't want to introduce any bottlenecks
at this end.
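To illustrate why a full outbound link can slow an incoming feed: the TCP
acknowledgements and NNTP replies for incoming articles have to leave on the
outbound side, and they wait behind whatever is already queued there. The
short Python sketch below estimates that wait for a single T1 at its nominal
1.544 Mbit/s line rate (usable payload is slightly less). The queue sizes are
hypothetical and were not measured on our links.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate


def queue_delay_seconds(queued_bytes, link_bps=T1_BITS_PER_SECOND):
    """Time for a packet to wait behind queued_bytes on the outbound link."""
    return queued_bytes * 8 / link_bps


# Hypothetical outbound queue depths, in bytes.
for queued in (0, 16_384, 65_536):
    delay_ms = queue_delay_seconds(queued) * 1000
    print(f"{queued} bytes queued: about {delay_ms:.1f} ms added delay")
```

Every reply that waits this long stretches the round trip the sending side
sees, which is the kind of latency the feed is sensitive to.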