Eskimo North

Eskimo Downtime

     Eskimo went down around 11am with an MMU error on CPU 1.  It did not
fail hardware diagnostics so it was a transient error of some sort.

     During the automatic file system checks that run after a crash, one
check failed to resolve a file system problem, so the machine dropped
into single-user mode and sat there.

     Virtually every other server is dependent upon that machine in some
way because it has all the user files on it and does the authentication
for logins.  Other machines crashed as a result of too many processes
backing up on NFS file requests.  Because of cross-mounts between various
machines, when machines with dependencies on each other both go down,
getting everything back online is really ugly.

     I was unaware of this until about 4:30pm when one of the dedicated
PPP customers called at an alternate contact number I gave all the
dedicated people.  John is out of town for a week, Jimmie was at the users
meeting today, and I've been battling pneumonia, so I have been a bit on
the slow side.

     Others had apparently tried to call that number earlier as well but
were mis-routed to someone else's line.

     I've set up the voice mail now on the main number, 812-0051, so that
it will page me if it is an urgent message (you touch-tone 1121 at the end
of the message).  Please do not mark the message urgent unless it is an
outage like this.