Eskimo North



          Broken Things...


          • To: ericj@eskimo.com, hydra@eskimo.com, badger@eskimo.com
          • Subject: Broken Things...
          • From: Robert Dinse <nanook@eskimo.com>
          • Date: Wed, 9 May 2001 02:42:14 -0700 (PDT)
          • cc: outages-list@eskimo.com
          • Newsgroups: lobby, announcements
          • Resent-Date: Wed, 9 May 2001 02:42:28 -0700
          • Resent-From: outages-list@eskimo.com
          • Resent-Message-ID: <"oKMWU2.0.WJ5.44H-w"@mx1>
          • Resent-Sender: outages-list-request@eskimo.com

          
               As Chris knows, the tape drive hooked up to Ultra1 hung, which hung the
          SCSI bus and resulted in massive file system damage to /userspace.
          
               It took about four hours of hair pulling to get the file system to where
          it would fsck clean.  Numerous files were deleted and/or moved to lost+found.
          There are over 5400 files in lost+found, so there is a hell of a mess to
          clean up.
          
               I knew we'd have a lot of stuff to restore from tape, so my first thought
          was to take the old eskimo drives, make a big partition out of them (4 x 9GB
          would have been enough), and restore the entire dump into that so we could
          copy the things that were needed disk to disk.
          
               Unfortunately, after Eric left, approximately 400MB into that restore, one
          of the old drives crapped out with some sort of mechanical problem in the head
          actuator (it's highly audible, to put it mildly).
          
               The long and short of it is we're going to have to restore things from
          tape.  I've got a restore going now of our www site and these users whose
          directories were entirely blitzed:
          
          	bhc, baldy, farrago, hshinn, neverstp, and eborders.
          
               Other users have files that ended up in /userspace/lost+found, named by
          inode number, and I haven't yet figured out where those files belong.  They
          include:
          
          	6wood, amcc, avery-ec, badger, baldy, bhc, biged98, bloo, bpentium,
          cek, crick, dalus, dempt, earthcng, erict, farrago, ghawk, hideaway, htak,
          jessamyn, jragon, milla, nanook (lucky me), ncpa, neverstp, nobody, ravensys,
          redfoot, root, sandy, smartlst, tweiler, user. 
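
               For anyone helping sort through lost+found, here is a rough sketch in
          Python of one way to group the recovered entries by owner.  It is only an
          illustration, not the procedure used here.  It assumes fsck kept the original
          owner on each recovered inode (it normally does), and the function names
          group_by_owner and owner_name are made up for this example.

```python
import os
import pwd
from collections import defaultdict


def owner_name(uid):
    """Return the login name for uid, or the numeric uid as a string if unknown."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this uid (for example, a deleted account).
        return str(uid)


def group_by_owner(lost_found):
    """Map each owner to the sorted names of entries directly under lost_found.

    Assumption: the owner stored in each recovered inode is still the
    original owner of the file.
    """
    groups = defaultdict(list)
    with os.scandir(lost_found) as entries:
        for entry in entries:
            # Look at the entry itself rather than following symlinks.
            st = entry.stat(follow_symlinks=False)
            groups[owner_name(st.st_uid)].append(entry.name)
    return {owner: sorted(names) for owner, names in groups.items()}
```

               Calling group_by_owner("/userspace/lost+found") would return one list
          of inode-named entries per user.  That only says whose the files are, not
          where in the home directory they came from, which still has to be worked out
          by looking at the contents.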
          
               Keep track of anything users say is missing or doesn't work and we'll grab
          those things from tape on additional passes after the initial restore is
          complete.
          
               Customers: Please be patient while we try to put this back together.
          Please let us know of any file(s) missing and we'll add those to the list of
          files to restore on the next tape pass.
          
               Per a recommendation on Exabyte's website (this is a bug they knew of),
          I've added a second SCSI controller to the machine just for the tape drive so
          that if it hangs in the future it won't affect the disks.
          
               I've got to tell you I'm feeling majorly bummed right now.  Last month and
          this month, expenses have exceeded income.  Problems with MegaPOP, losing DSL,
          etc., have created a lot of additional short term expenses.
          
               And then, as we've been experiencing all of the economic problems,
          customers have been subscribing for much shorter terms, greatly reducing the
          immediate income.
          
               At the same time we've lost hardware (the entire server) which needed to
          be replaced, had overlap in port wholesalers as we moved people from MegaPOP
          mid-month, and I've ordered a third T1 to address data congestion issues (which
          is due to be installed June 11, 2001). 
          
               What I really want to do with Eskimo, and what is seeming like a more and
          more distant goal, is to reproduce the original interface we used to have on a
          single-line BBS, including the games, etc., but with Internet extensions and
          available via the web and telnet as well.
          
               The Internet has huge potential to bring people together that I believe is
          almost totally unrealized, and I want to see that potential realized, but damn,
          it just seems like every time I get one foot forward I'm blasted back ten.
          
               Folks, I'd really, really appreciate your support at this time in any way
          you can give it: bringing new users to us, or subscribing for longer terms so
          we have some much needed immediate income to cover expenses.  And if anyone
          wishes to help develop code, we can use it.
          
               I'm really getting bummed about this kind of marginal existence.  If I had
          my druthers, we'd have a second file server as a hot spare, constantly
          mirroring the data on the main server so if the first server died we could just
          tell the second to go active, UPS for everything, etc.  But right now things
          are extremely tight.
          
               Barring any sudden rush of new users and income, I'm going to be forced to
          make some upward adjustments to the rates, which I know will lose a percentage
          of customers, but we need some redundancy and cushion for the disasters that
          inevitably happen.
          
          
          
