Sick Host Machine

     I think the motherboard is damaged in one of our hosts, the one that holds the mail spool, mail.eskimo.com, mx2.eskimo.com, debian.eskimo.com, and scientific7.eskimo.com.

     In addition to random reboots, the machine is sometimes reporting disk errors, but the SMART status shows no problems with the drives and no errors recorded. That leaves the disk controllers, which are on the motherboard.

     So tonight the services mentioned above will be down for a period of time while I move them off of this failing machine, so that I can take it out of service and replace the motherboard.

     When the BIOS lost its fan settings, which shut down a chassis fan, the machine got quite warm. It's hard to say whether the heat damaged the board or existing damage caused the BIOS to lose its settings.

     At any rate, I am moving things off so I can take it out of service for several days to replace the motherboard and then burn it in properly (extensive testing with mprime while monitoring temperatures, etc.). This is both to make sure it is stable and to find the minimum voltage at which the CPU will operate stably. The lower the voltage, the less the heat and the longer the life.
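
     For anyone curious what watching temperatures during a burn-in can look like, here is a minimal sketch. It is not the script used on this host. It assumes a Linux machine that exposes sensors under /sys/class/thermal (some boards only expose them through hwmon instead), and the 80 C limit and the function names are made up for illustration.

```python
import time
from pathlib import Path

# Assumption: the kernel exposes each sensor as
# /sys/class/thermal/thermal_zone*/temp, in millidegrees Celsius.
THERMAL_GLOB = "thermal_zone*/temp"


def read_temps_c(base="/sys/class/thermal"):
    """Return {zone_name: degrees C} for every readable thermal zone."""
    temps = {}
    for path in sorted(Path(base).glob(THERMAL_GLOB)):
        try:
            temps[path.parent.name] = int(path.read_text().strip()) / 1000.0
        except (OSError, ValueError):
            # Skip zones that cannot be read or do not contain a number.
            continue
    return temps


def over_limit(temps, limit_c=80.0):
    """Return the sorted names of zones hotter than limit_c (hypothetical limit)."""
    return sorted(name for name, t in temps.items() if t > limit_c)


def monitor(samples=10, interval_s=5.0, limit_c=80.0):
    """Print readings a fixed number of times while a stress test runs elsewhere."""
    for _ in range(samples):
        temps = read_temps_c()
        hot = over_limit(temps, limit_c)
        print(temps, "OVER LIMIT: " + ", ".join(hot) if hot else "ok")
        time.sleep(interval_s)
```

     Calling monitor() in a second terminal while the stress test runs prints one line of readings per interval and flags any zone above the chosen limit.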
