Reboots and Imaging – Maintenance Outage Tonight

     Tonight, shortly after midnight, I will be rebooting the physical hosts, which will reboot everything, because recent security updates require a reboot to take effect.  This should take about fifteen minutes.

     Then I will take the web server down for about an hour to image it and to extend the file system to accommodate a larger database required by a new customer.

 

Stability – Interim

     I have the physical hosts up on 5.4.0 now; this is the kernel that shipped with Ubuntu 20.04.  It is not entirely stable on machines that serve as NFS servers, which two of these machines do.

     I have built 5.3.0 kernels, a version that was previously stable, but I cannot boot into them right now because automatic backups are running.  I will attempt to do so tomorrow evening.  I also don’t know whether MY 5.3 kernel will be stable, since Canonical applies some patches that will not be present in my build, but Canonical’s 5.3 kernel is no longer available because 19.10 is now past end of life.

Kernel Issue Emergency Reboot

     Sorry, I had to do an emergency reboot of the servers today.  I discovered what caused this morning’s crash: a bug in the kernel I was running kept it from recovering resources as processes exited and new ones were created, so it would continue to eat memory until it ran out, then crash and reboot.

     I don’t know if the present kernel is trouble-free, but I do know it does not have this particular problem, and NFS seems to work correctly in it.  Getting NFS to work has been the main reason for all the kernel experimentation lately.
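     A leak like the one described above tends to show up as available memory that falls steadily and never recovers.  The sketch below is only an illustration of how that pattern could be watched for; it is not what was used to diagnose the crash.  The function names and the 20% threshold are assumptions, and the only real interface it relies on is the MemAvailable line of /proc/meminfo.

```python
import re


def parse_mem_available_kib(meminfo_text):
    """Return the MemAvailable value in KiB from /proc/meminfo-style text."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable not found")
    return int(match.group(1))


def looks_like_leak(samples_kib, min_drop_fraction=0.2):
    """Rough heuristic (threshold is an assumption, not a tuned value).

    True when available memory never rises from one sample to the next
    and has fallen by at least min_drop_fraction of the first sample.
    """
    if len(samples_kib) < 3:
        return False  # too few samples to call it a trend
    never_recovers = all(b <= a for a, b in zip(samples_kib, samples_kib[1:]))
    total_drop = samples_kib[0] - samples_kib[-1]
    return never_recovers and total_drop >= min_drop_fraction * samples_kib[0]
```

     In practice the samples would come from reading /proc/meminfo at a fixed interval.  A steady decline can also have innocent causes, so a result of True is a reason to look closer, not proof of a kernel bug.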

Mail Servers

     After I installed a newer version of fail2ban on the client mail server, the attackers were no longer able to overload the machine, as it could keep up with the changing attacking IP addresses.

     Now they are attacking our incoming mail servers.  Although they are increasing the load on these machines, they cannot obtain login credentials from them because these machines are not set up for authentication.

     I am installing the newer fail2ban on these machines to bring the load down, as I did with the client server.

Client Mail Server Under Attack

     The client mail server is still under attack.  The attack consists of a botnet trying to guess users’ logins and passwords by brute force through Postfix auth.  So far a little over 3,000 IP addresses are blocked, but fail2ban can’t work fast enough to catch them all in real time.
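     For readers unfamiliar with how this kind of blocking works, here is a minimal sketch of the general idea: count failed logins per source address and flag the addresses that cross a threshold.  This is not fail2ban’s code or configuration.  The log line shape in the regular expression is an assumption about how Postfix reports a failed SASL login, the names ips_to_ban and max_retry are made up for the example, and the time window that fail2ban also applies is left out.

```python
import re
from collections import Counter

# Assumed shape of a failed-login log line, e.g.
#   "... postfix/smtpd[1234]: warning: unknown[203.0.113.5]: SASL LOGIN authentication failed: ..."
# Real logs may differ; adjust the pattern to match them.
FAILED_AUTH = re.compile(
    r"\[(\d{1,3}(?:\.\d{1,3}){3})\]: SASL [\w-]+ authentication failed"
)


def ips_to_ban(log_lines, max_retry=3):
    """Return the sorted IPv4 addresses with at least max_retry failed logins."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_AUTH.search(line)
        if match:
            failures[match.group(1)] += 1
    return sorted(ip for ip, count in failures.items() if count >= max_retry)
```

     Every log line has to be scanned and every new address added to the block list, so the work grows with the number of attacking addresses.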

Reboots and DDoS

     Iglulik spontaneously rebooted again this morning; I still have no idea what is causing this.

     Mail came under a DDoS attack around 9:30 this morning, and it was still ongoing at 10 AM.  The attack has essentially saturated fail2ban, which is locking out attacking IPs as fast as it can.  Load is heavy, but the server is still functional, just a bit slower than normal.