Mail Server

     We have recently had a problem with dovecot corrupting its index files.  I do not know exactly when this started.  The version that Ubuntu provides is 2.3.11.

     I recently moved the mail client server’s storage from an NFS-mounted partition to local disk, as the general consensus within the dovecot community seemed to be that this was an NFS-related problem.

     However, that did not resolve the issue.  Today I compiled and installed the most recent version, 2.4-devel.  I also changed the locking from fcntl to dotlock, which is slower but generally considered more reliable.
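
     For readers unfamiliar with the difference, here is a minimal Python sketch of the dotlock idea: the lock is an extra file created next to the one being protected, rather than an fcntl lock held by the kernel.  This is an illustration only, not dovecot’s implementation, which also handles stale locks and other edge cases.  The helper name, the timeout values and the example path are all made up for the sketch.  In dovecot itself the change is presumably the lock_method setting, though that is my assumption about which setting is involved.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def dotlock(path, timeout=10.0, poll=0.1):
    """Hold an exclusive lock on `path` by creating `path + '.lock'`.

    Hypothetical helper for illustration; not part of dovecot.
    """
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the lock file already exists,
            # so only one process at a time can create it.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
            break
        except FileExistsError:
            # Someone else holds the lock: wait and retry until the deadline.
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        # Record our PID in the lock file so a human can see who holds it.
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        yield lock_path
    finally:
        # Releasing the lock means removing the lock file.
        os.unlink(lock_path)


if __name__ == "__main__":
    # Example path is hypothetical.
    with dotlock("/tmp/example.index", timeout=2.0) as held:
        print("holding", held)
```

     Every lock and unlock here costs a file creation and a file removal, plus polling while someone else holds the lock, which is one reason dotlock tends to be slower than fcntl.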

Mail

     My last attempt to change the way the mail system is organized failed because the kernel we were using had a bug that caused it to incorrectly read the partition tables.

     The kernel problem has been corrected.  I am going to take mail down for about an hour now to move it to igloo and to attempt to restructure it so that the spool is off NFS for the client server, because dovecot, which we use to provide IMAP and POP3, has some issues working properly in an NFS environment.

     During this time you will be able to read mail on the shell servers but not via webmail, IMAP, or POP3, and you will be unable to send mail.  This work should be completed by 10PM.

Emergency Reboots Tonight

     We have had some issues with three different customers accessing e-mail.  I was unable to replicate this until tonight.  When it did fail for me, the failures indicated an NFS problem with the new kernels.  Consequently, I am going to revert the NFS servers and mail clients to a previously known working kernel shortly.  This will, unfortunately, interrupt everyone’s session.

Drive Taking Errors

     One of our machines has a drive that is taking some errors.  It passes the SMART internal diagnostics completely, but the errors indicate problems finding sector headers.  This can happen if a machine isn’t shut down properly, say during a power outage, which can clobber some sector headers.

     While this can be fixed by a format, this drive is older than dirt (about eight years), so I’ve ordered a replacement.  I expect to take this machine down Halloween evening for the drive replacement.  This is the only non-RAID drive on the machine, but it’s used for booting.  If it fails no data will be lost, but the machine will be unable to boot until we replace it.  I have a smaller drive I could swap in if the replacement does not arrive on time, but I would rather use a fresh drive, and one with a larger cache should provide faster boot times.

 

Digital Ocean Spam / Virus

     We have received a large number of spams containing a virus, exclusively from Digital Ocean address space.  For every one of these I have sent e-mail to their published abuse address, abuse@digitalocean.com, and to their NOC at noc@digitalocean.com.

     I have yet to receive a single reply, so I initially started blocking the individual addresses these came from.  But still they continue.  Now I am blocking entire address blocks as we receive this virus / spam.  I am also sending it to blacklist maintainers and using it as a source to train our Bayesian filters.

     At present the following address space is blocked for incoming mail:

167.172.127.122 REJECT Spam Digital Ocean
165.227.147.88 REJECT Spam Digital Ocean
128.199.13.160 REJECT Digital Ocean Virus
159.203.181.43 REJECT Digital Ocean Virus
188.166.64.227 REJECT Digital Ocean Virus
209.97.155.51 REJECT Digital Ocean Virus
204.48.23.113 REJECT Digital Ocean Virus
104.248.58.145 REJECT Digital Ocean Virus
198.199.120.66 REJECT Digital Ocean Virus
138.197.0.0/16 REJECT Digital Ocean Virus
143.110.128.0/17 REJECT Digital Ocean Virus
142.93.0.0/16 REJECT Digital Ocean Virus
159.203.0.0/16 REJECT Digital Ocean Virus
159.89.0.0/16 REJECT Digital Ocean Virus
159.65.0.0/16 REJECT Digital Ocean Virus
174.138.0.0/17 REJECT Digital Ocean Virus
64.227.0.0/17 REJECT Digital Ocean Virus
188.166.0.0/17 REJECT Digital Ocean Virus

     I don’t like to do this, but when a company will not respond to complaints and the spams are viral in nature, I am left with little choice.  I have also submitted a copy to the ClamAV folks so they can generate a signature for this.
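
     If you want to check whether a particular sending address falls inside the blocked space listed above, the following Python sketch parses the same lines and tests an address against them.  It is only an illustration and not how the mail server itself evaluates the list.  The function names are made up, and treating the first matching line as the winner is an assumption of the sketch.

```python
import ipaddress

# The block list exactly as published above.
ACCESS_TABLE = """\
167.172.127.122 REJECT Spam Digital Ocean
165.227.147.88 REJECT Spam Digital Ocean
128.199.13.160 REJECT Digital Ocean Virus
159.203.181.43 REJECT Digital Ocean Virus
188.166.64.227 REJECT Digital Ocean Virus
209.97.155.51 REJECT Digital Ocean Virus
204.48.23.113 REJECT Digital Ocean Virus
104.248.58.145 REJECT Digital Ocean Virus
198.199.120.66 REJECT Digital Ocean Virus
138.197.0.0/16 REJECT Digital Ocean Virus
143.110.128.0/17 REJECT Digital Ocean Virus
142.93.0.0/16 REJECT Digital Ocean Virus
159.203.0.0/16 REJECT Digital Ocean Virus
159.89.0.0/16 REJECT Digital Ocean Virus
159.65.0.0/16 REJECT Digital Ocean Virus
174.138.0.0/17 REJECT Digital Ocean Virus
64.227.0.0/17 REJECT Digital Ocean Virus
188.166.0.0/17 REJECT Digital Ocean Virus
"""


def parse_table(text):
    """Turn 'pattern ACTION reason' lines into (network, action, reason) tuples."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        pattern, action, *reason = line.split(None, 2)
        # A bare address such as 167.172.127.122 becomes a /32 network.
        network = ipaddress.ip_network(pattern)
        rules.append((network, action, reason[0] if reason else ""))
    return rules


def lookup(rules, client_ip):
    """Return (action, reason) for the first rule containing client_ip, else None."""
    addr = ipaddress.ip_address(client_ip)
    for network, action, reason in rules:
        if addr in network:
            return action, reason
    return None


if __name__ == "__main__":
    rules = parse_table(ACCESS_TABLE)
    for ip in ("138.197.5.9", "143.110.127.1"):
        print(ip, lookup(rules, ip))
```

     Note that a /17 covers only half of the corresponding /16: 143.110.128.0/17 blocks 143.110.128.0 through 143.110.255.255, so an address such as 143.110.127.1 is not matched.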

Unannounced Kernel Upgrade

     I apologize for the unannounced kernel upgrade this morning.  It was done in a hurry because a security flaw was discovered in 5.8 and earlier kernels, and I wanted to eliminate it as quickly as possible.  We are now running 5.9 kernels.

     This took two rounds last night because my first build of 5.9 was not correctly configured for our servers, so I had to rebuild, re-install, and reboot again.

     5.9 has a minor bug in the NFS code that is printing some warnings.  It involves a race condition when a client attempts to open a file it doesn’t have permissions to open.  Since the open would have failed anyway on the basis of permissions I do not believe this bug has any significant operational consequences other than making noise in the kernel logs.

     I checked bugzilla and there is already a bug report filed, though given the relatively low severity I doubt it will get rapid attention.  I have added myself to the notification list, so I will post an update once it is fixed.

Web Server Downtime

     Sorry for the downtime between 2 and 4:30AM.  Something got corrupted and I was unable to log in with x2go.  I had gotten friendica working since the last backup, so I did not want to just restore from backups and lose all the work I had done.

     I had to delete and reinstall about half the operating system and recompile Apache and some of the libraries it depends upon, because something caused the system to install its own Apache and some libs over my custom-compiled version, which is much newer than the distro’s and has some capabilities the distro’s does not.

     When I finally got everything working again, I made a backup so all that work is saved.