The mail issue was a problem with Dovecot: for some reason it did not open the SASL socket. I restarted it and mail is now working.
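A quick way to spot this condition is to check whether the socket file is actually present. The following is only a minimal sketch: the function name is made up for illustration, and the socket path in the usage note is an assumption (a common Postfix/Dovecot layout), not necessarily the path used here.

```python
import os
import stat


def is_unix_socket(path):
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing path, permission problem, etc.: treat as "no socket".
        return False
```

For example, `is_unix_socket("/var/spool/postfix/private/auth")` would return False if Dovecot never created its SASL socket at that (assumed) location. A True result only shows that the socket file exists, not that anything is listening on it.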
Mail Issues
We are having some problems with the mail sub-system that I am still struggling to understand. It is responding slowly and refusing service to some hosts while permitting it to others and I have not yet been able to determine why.
Ice Spontaneously Rebooted Today
Ice, a machine which holds the /mail partition as well as mail, mx2, and some private virtual machines, spontaneously rebooted today.
There still appear to be some stability issues in the 5.7 kernel. I suspect they are related to NFS: both ice and iglulik export major NFS partitions used by the rest of the machines here, while the physical hosts and the virtual machines that do not export NFS have all been stable.
There were some major changes to the NFS v4.2 code in 5.7 which I hope, once adequately debugged, will result in better reliability. I’m not giving up on this kernel just yet; I will keep up with point releases and try to identify in greater detail what is failing.
Everything is back in service but I still need to check all the mounts.
Iglulik
Iglulik, the host which holds the /home directories, spontaneously rebooted at about 3pm today. I do not yet know what caused this. I will need to re-check all the NFS mounts on the other hosts, since invariably some of them fail to remount correctly.
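The mount check can be partly automated by comparing the kernel's mount table with the list of mount points a host is supposed to have. This is a minimal sketch, not the procedure actually used here: the function name is hypothetical, and it assumes the Linux `/proc/mounts` format (device, mount point, filesystem type, options).

```python
def missing_nfs_mounts(mounts_text, expected):
    """Return the expected mount points that are not mounted via NFS.

    `mounts_text` is the content of /proc/mounts; `expected` is a list
    of mount points such as ["/home", "/mail"].
    """
    mounted = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 1 is the mount point, field 2 the filesystem type.
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            mounted.add(fields[1])
    return [path for path in expected if path not in mounted]
```

On a client this could be called as `missing_nfs_mounts(open("/proc/mounts").read(), ["/home", "/mail"])`; an empty list means every expected mount is present. It only reports what is mounted and does not detect a stale or hung mount.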
Hosts NFS / NIS Mounts / Binding Verified
NFS mounts and NIS bindings have been checked on all hosts. A few hosts failed to come up completely after reboot. All of these problems have been resolved and all hosts are operational except for OpenSuse.
OpenSuse has a problem with a library that breaks NIS. I opened a ticket on this close to half a year ago. If it is not resolved soon I am going to discontinue this host.
If anyone has any suggestions for a better Linux distro, please e-mail them to nanook@eskimo.com.
Thank you.
Reboots Complete – Still Checking Hosts
The reboots are completed but I am about an hour behind schedule.
Two things set me back. First, SOMETHING installed dnsmasq on my stealth master DNS server. This is a master hidden behind a firewall so that hackers can’t inject nastiness into it; it supplies all the secondary servers with zone records.
Because it runs bind, it does not need dnsmasq. Worse, dnsmasq breaks bind IF it starts first, because it uses the same network port (53) as bind and so blocks bind from attaching to that port and functioning.
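The port conflict is easy to reproduce in miniature. The sketch below is illustrative only: it uses a free high port on localhost rather than port 53 (which needs root), and the function name is made up. Two plain sockets stand in for dnsmasq and bind.

```python
import errno
import socket


def second_bind_is_refused(host="127.0.0.1"):
    """Bind one UDP socket, then try to bind a second to the same port.

    Returns True if the second bind fails with EADDRINUSE, which is
    what the later-starting daemon runs into.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind((host, 0))  # port 0: let the OS pick a free port
        port = first.getsockname()[1]
        try:
            second.bind((host, port))
        except OSError as exc:
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        first.close()
        second.close()
```

Whichever process binds first wins; the second gets "address already in use" and cannot serve on that port.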
So when I rebooted about a week ago, bind apparently lost that race and the master stopped supplying zone records. Those records have only just now expired, at which point all the secondary servers quit serving them. When I went to ssh into the server, my workstation couldn’t find the records (and neither could any external computer), so it was broken for everyone. Because I had posted about the reboots, everyone was expecting an outage and nobody called, so I was unaware until I tried to connect, and then it took me a little while to figure out what the hell was going on.
Second, once that was resolved, one of Canonical’s engineers (the Ubuntu developers) asked me to try an experiment for them to help nail down a problem with an AppArmor profile for libvirtd, and that took additional time.
Everything is rebooted now but I am still checking for proper NFS mounts and NIS binding of hosts to servers.
Server Reboots
I am planning to reboot the physical hosts tonight, starting at midnight; this will affect all services. The reboots should be complete by about 12:30, followed by another hour or so to check all the servers for proper NFS mounting and NIS binding, which is not 100% reliable under Linux.
Eskimo SSL Certificate Replaced
We have been issued new SSL certificates and have installed them to correct the problems with the previous certificates caused by the issuer.
Eskimo SSL Certificate
We use a site-wide SSL certificate for our web and mail servers which we purchased from RAPIDSSL.
About two years ago they were bought out by an outfit called SECTIGO, and the asses apparently decided to shut off the intermediate certificate server even though they still had acquired customers using those certificates, and without providing ANY warning to those customers (us). At least, this is the explanation I’ve gotten from Integraserver, the dealer I bought the certificate through. SSLShopper’s SSL checker tells me the intermediate certificate expired a day ago.
Consequently, when you connect to our mail server you will get a message saying it is unable to verify the authenticity of the certificate. The web server is still working because Apache allows us to configure the intermediate certificates into it, so it doesn’t rely on their servers, but there is no such option with the mail servers.
I am having the certificates re-issued to reflect the current intermediate certificates and will install them as soon as I receive them.
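For anyone who wants to see what a mail client sees, the check can be sketched with Python's standard `ssl` module. This is a rough illustration under assumptions: the function names are made up, port 993 (IMAP over TLS) is assumed, and the host name in the usage note is a placeholder rather than our actual server name.

```python
import socket
import ssl
import time


def days_left(not_after, now=None):
    """Days until a certificate 'notAfter' string such as
    'Jan  5 09:34:43 2018 GMT' is reached (negative if already past)."""
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def verify_tls(host, port=993, timeout=10):
    """Connect with normal verification and report what happens."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                cert = tls.getpeercert()
    except ssl.SSLCertVerificationError as exc:
        # A broken or expired chain shows up here.
        return "verification failed: " + exc.verify_message
    return "ok, server certificate expires in %.0f days" % days_left(cert["notAfter"])
```

Calling `verify_tls("mail.example.com")` returns either a verification error message or the number of days left on the server's own certificate. Note that `getpeercert()` only describes the server certificate, so a problem in the intermediate chain appears as a verification failure rather than as an expiry date.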
Reboots Done
Reboots are completed, and all NFS mounts and NIS bindings have been checked. As near as I can tell everything is operational again.