Eskimo North

Outage Friday May 23

     I apologize for the lengthy outage we had this evening.  A 24-port N-way
data switch that is central to the entire LAN died.  Unfortunately, it didn't
die hard but flaked out badly, which made troubleshooting very difficult.

     Initially it looked like a problem with Eskimo, then Ultra1, then it
looked like some sort of denial-of-service attack, but I turned down all of the
T1s, isolating us from the net, and it continued.

     Then it came down to shutting down just about everything to try to
determine what was causing the problems.

     At one point a core group of machines seemed to communicate OK, could
ping each other, but when we tried to get protocols like NFS and NIS to talk,

     Then I found the last group of eight ports seemed to be hosed on the
switch, and moved everything that was critical over to the first 16, and things
sort of worked, for a few minutes.

     At that point I ran and got another switch and swapped it in for the
failed one, and now things seem to be back to "normal" again.

     Still working out a few minor glitches, but things should be basically
back to functional.