Server

     I found that the web server as a whole was not down, only MariaDB (the database that replaced MySQL).  MariaDB is necessary to our site and to any site that uses WordPress or other content management engines.

     I’ve added a script to crontab that checks once a minute and restarts MariaDB if it dies.  But there is no indication in the logs of what caused it to die.  The only errors logged were from an attempt to access a non-existent database, but that does not cause MariaDB to die, so I still do not know what did.
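
     The actual watchdog script is not shown in this log.  As a rough sketch of what a once-a-minute check could look like, the following assumes a systemd-managed service whose unit is named "mariadb"; the unit name and the function name are illustrative assumptions, not details taken from the server.

```python
import subprocess

SERVICE = "mariadb"  # assumed systemd unit name; adjust for the local install


def check_and_restart(service=SERVICE, run=subprocess.run):
    """Restart the service if systemd reports it is not active.

    Returns True if a restart was attempted, False if the service was fine.
    """
    # `systemctl is-active --quiet` exits 0 only when the unit is active.
    status = run(["systemctl", "is-active", "--quiet", service])
    if status.returncode == 0:
        return False
    # The service is not running, so ask systemd to start it again.
    run(["systemctl", "restart", service])
    return True
```

     A small wrapper script that calls check_and_restart() could then be run from root's crontab with a "* * * * *" schedule.  The "run" parameter exists only so the logic can be exercised without touching a real service.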

 

 

Web Server Crash

     Our web server crashed some time this morning.  I received only one call about it, so I was unaware that it was down.  I’m going to try to put together some means of automatic monitoring.
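
     One simple form such monitoring could take is an HTTP probe run from a machine outside the web server.  The sketch below is only an illustration of that idea, not the monitoring that was eventually set up; the function name, the 10-second timeout, and the "status below 400" rule are all assumptions.

```python
import urllib.error
import urllib.request

URL = "https://www.eskimo.com/"  # site URL from this log; any page on the server would do


def site_is_up(url=URL, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts, and HTTP
        # error responses (HTTPError is a subclass of URLError).
        return False
```

     Run from cron on a separate host, a False result could trigger an email or a page.  The "opener" parameter is there so the check can be tested without network access.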

     I do not yet know the cause of the crash.  We have been hit with denial-of-service attacks for the last few days, but I do not know if this was related.

     It is back in service now but there are some database errors I am still investigating.

ns3.eskimo.com

     One of our name servers, ns3.eskimo.com, is currently down.

     This was due to a combination of the DNSSEC root-key change and my attempt to be lazy in resolving it.  I attempted to purge and re-install the bind package, thinking that would get me fresh conf files including the DNS key file.  However, Red Hat had removed the package from the CentOS repository, so I could not re-install.

     I am in the process of moving this off of the CentOS 6 server onto an Ubuntu-based server.  Until this is completed, name resolution may be slow at times if ns3 is tried first.

Web Server Upgrade

     I upgraded our web server operating system from Ubuntu 17.10 to Ubuntu 18.04 tonight.

     I replaced our previous database, MySQL 5.7, with MariaDB 10.3.  This was not entirely smooth: the existence of a mysql entry in the NIS database conflicted with the local user that the install script wanted to create, and NOTHING in the error messages gave a clue as to why it was failing.
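
     A conflict of this kind can be checked for before installing: a user name that resolves through the system's name service but has no entry in the local passwd file is coming from somewhere else, such as NIS.  The sketch below illustrates that check under those assumptions; the function names are hypothetical and it is not part of any install script.

```python
import pwd


def local_users(passwd_text):
    """Return the set of user names defined in passwd-format text."""
    names = set()
    for line in passwd_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and NIS compat entries such as "+::::::".
        if not line or line[0] in "#+-":
            continue
        names.add(line.split(":", 1)[0])
    return names


def defined_only_remotely(user, passwd_text, lookup=pwd.getpwnam):
    """True if `user` resolves via the name service but has no local entry."""
    try:
        lookup(user)
    except KeyError:
        return False  # the user is not known anywhere
    return user not in local_users(passwd_text)
```

     For example, defined_only_remotely("mysql", open("/etc/passwd").read()) returning True would suggest that the mysql user is supplied by NIS or another remote source rather than by the local file.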

     When all was said and done, this shaved about 150ms off of the time to first byte and about half a second off our total page load time so it was worthwhile.

Router Replacement

     The router I ordered from Amazon did not arrive yesterday.  When I checked shipping, it had not even shipped.

     I called Amazon and they said they were out of stock and it was back ordered, estimated to be here on June 6th.

     I asked to have my one-day shipping fee refunded, since they obviously did not ship in time for it to arrive the next day.  The droid refused.

     I asked to talk to a supervisor and made the same request.  The supervisor complied.

     I’ve placed another order with a provider in Oregon that does have one in stock, and it should be here Saturday or Monday.  I did not cancel the Amazon order, so I will have a spare.  Operating without a hardware firewall is undesirable, so I wish to resolve this as soon as possible.

     In the meantime, our website can be reached via https://www.eskimo.com/ but not without the ‘www’.

Host Desktop or Terminal (Guacamole)

     Several recent changes I made to our web server broke guacd, the daemon that interfaces to the server to provide mapping of VNC or SSH to a desktop or terminal.  The changes were compiling and installing the most recent versions of openssl and gnupg, and installing new SSL certificates with a slightly different, more unified naming convention.

     I have since recompiled guacd and fixed the configuration file to reflect the new naming convention for the certificates.  The change to certificate naming and location allows me to drop a new certificate into one place and have it be effective system wide.

Outage – Router Failure

     Our border router failed today.  This is the router that connects the outside world to our servers.  This particular model used to be carried by Fry’s but, as luck would have it, they were sold out.  There are no other dealers that I can find listed locally, so I’ve ordered a replacement from Amazon.

     In the meantime, I’ve configured one of our servers to play router.  This may slow things down somewhat, but the slowdown should not be significant.