Ice Spontaneously Rebooted Today

     Ice, a machine which holds the /mail partition as well as mail, mx2, and some private virtual machines, spontaneously rebooted today.

     There still appear to be some stability issues in the 5.7 kernel.  I suspect these are related to NFS: both ice and iglulik export major NFS partitions used by the rest of the machines here, while the physical hosts and virtual machines that do not export NFS have all been stable.

     There were some major changes to the NFS v4.2 code in 5.7 which I hope, once adequately debugged, will result in better reliability.  So I’m not giving up on this kernel just yet, but I will keep up with point releases and try to identify in greater detail what is failing.

     Everything is back in service but I still need to check all the mounts.

Iglulik

     Iglulik, the host which holds the /home directories, spontaneously rebooted at about 3pm today.  I do not yet know what caused this.  I will need to re-check all the NFS mounts on the other hosts, since invariably some of them fail to remount correctly.
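
     A check along these lines can flag mounts that did not come back.  This is only a sketch: it parses text in the /proc/mounts format, and the mount points and export paths in the sample are hypothetical stand-ins, not the actual configuration here.

```python
def missing_nfs_mounts(mounts_text, expected_mount_points):
    """Return the expected mount points that are not mounted as NFS.

    mounts_text is text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    mounted = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        if fs_type in ("nfs", "nfs4"):
            mounted.add(mount_point)
    return [mp for mp in expected_mount_points if mp not in mounted]


# Hypothetical sample; on a real client, read /proc/mounts instead.
sample = """\
/dev/sda1 / ext4 rw,relatime 0 0
iglulik:/home /home nfs4 rw,relatime,vers=4.2 0 0
"""
print(missing_nfs_mounts(sample, ["/home", "/mail"]))  # ['/mail']
```

     On a client machine the same function could be fed the contents of /proc/mounts, with the list of expected mount points maintained by hand.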

Hosts NFS / NIS Mounts / Binding Verified

     Host NFS mounts and NIS bindings have been checked.  A few hosts failed to come up completely after the reboot.  All of these problems have been resolved and all hosts are operational except for OpenSuse.

     OpenSuse has a problem with a library that breaks NIS.  I opened a ticket on this close to half a year ago.  If it is not resolved soon I am going to discontinue this host.

     If anyone has any suggestions for a better Linux distro, please e-mail them to nanook@eskimo.com.

     Thank you.

Reboots Complete – Still Checking Hosts

     The reboots are completed but I am about an hour behind schedule.

     Two things set me back.  First, SOMETHING installed dnsmasq on my stealth master DNS server.  It is a master that is hidden behind a firewall so that hackers can’t inject nastiness into it, and it supplies all the secondary servers with zone records.

     Because it has bind, it does not need dnsmasq.  Further, dnsmasq breaks bind IF it starts first, because it uses the same network port (53) as bind and so blocks bind from attaching to that port and functioning.
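
     The conflict is the ordinary "address already in use" failure.  The sketch below reproduces it with two plain UDP sockets on an unprivileged loopback port; it does not touch port 53, bind, or dnsmasq, and the helper name can_bind is made up for illustration.

```python
import errno
import socket


def can_bind(host, port, sock_type=socket.SOCK_DGRAM):
    """Return True if a new socket can bind to (host, port), else False."""
    probe = socket.socket(socket.AF_INET, sock_type)
    try:
        probe.bind((host, port))
        return True
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False
        raise  # e.g. permission denied on ports below 1024
    finally:
        probe.close()


# Stand-in for whichever daemon starts first: take a port and hold it.
first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
first.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
held_port = first.getsockname()[1]

print(can_bind("127.0.0.1", held_port))  # the port is taken while `first` holds it
first.close()
```

     Whichever daemon binds first wins, which is why the start order at boot decides whether bind comes up.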

     So when I rebooted about a week ago, dnsmasq evidently started first and bind on the master stopped functioning.  The zone records have just now expired, and all the secondary servers quit serving them.  When I went to ssh into the server, my workstation couldn’t find it (and neither could any external computer), so it was broken for everyone.  But because I had posted about the reboots, everyone was expecting an outage and nobody called, so I was unaware until I tried to connect, and then it took me a little while to figure out what the hell was going on.
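
     The delay comes from the expire field in each zone’s SOA record: a secondary keeps answering from its last good copy until that many seconds pass without a successful refresh from the master.  The arithmetic is sketched below; the date and the 7-day expire value are hypothetical, chosen only to illustrate a roughly one-week delay, and are not taken from the actual zones.

```python
from datetime import datetime, timedelta


def zone_expiry(last_refresh, soa_expire_seconds):
    """Time at which a secondary stops answering for the zone."""
    return last_refresh + timedelta(seconds=soa_expire_seconds)


# Hypothetical values for illustration only.
last_refresh = datetime(2020, 6, 1, 0, 30)  # last successful zone transfer
soa_expire = 604800                         # SOA expire field: 7 days in seconds

expires_at = zone_expiry(last_refresh, soa_expire)
print(expires_at)  # 2020-06-08 00:30:00
```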

     And then once that was resolved, one of Canonical’s engineers (the Ubuntu developers) asked me to try an experiment for them in order to nail down a problem with an AppArmor profile for libvirtd, and that took additional time.

     Everything is rebooted now but I am still checking for proper NFS mounts and NIS binding of hosts to servers.

Server Reboots

     I am planning on rebooting the physical hosts tonight starting at midnight, which will affect all services.  I should be done by about 12:30, and then it will take another hour or so to check all the servers for proper NFS/NIS mounting/binding, which is not 100% reliable under Linux.

Eskimo SSL Certificate

     We use a site-wide SSL certificate for our web and mail servers which we purchased from RAPIDSSL.

     About two years ago, they were bought out by an outfit called SECTIGO, and the asses apparently decided to shut off the intermediate certificate server even though they still had acquired customers using those certificates, and without providing ANY warning to those customers (US).  At least this is the explanation I’ve gotten from Integraserver, the dealer I bought the certificate through.  SSLShopper’s SSL checker tells me the intermediate certificate expired a day ago.

     Consequently, when you connect to our mail server you will get a message saying it is unable to verify the authenticity of the certificate.  The web server is still working because Apache allows us to configure the intermediate certificates into it, so it doesn’t rely on their servers, but there is no such option with the mail servers.

     I am having the certificate re-issued to reflect the current intermediate certificates and will install it as soon as I receive it.

Physical Host Reboots

     Tonight I have to reboot the physical hosts, which run various virtual machines and the NFS file servers that host things like your mail spool and home directories.  I expect to start this around 12AM and the reboots should be completed by about 12:30, but it will take a few more hours to check all the client machines to be sure NFS has properly remounted and NIS has properly rebound to the NIS servers.

Mail to Comcast

     A couple of days ago a customer’s account here was compromised at about 2 PM and used to send out about 40,000 spams before I shut it down.  I managed to delete the majority of them from the queue before they were processed, but some got through.

     In response, Comcast blacklisted our entire domain rather than the one customer all the spam came from.

     I’ve already talked to Comcast Security and have been told it will take them 24-72 hours to remove the block.  The ticket number is #NA250519982.