What is Wrong

     I figured out what was wrong with the 5.17.11 kernel.  It is an option I selected which hardens the kernel by zeroing the stack to initialize everything before and after a call.  This requires an argument to gcc which gcc-12.1 doesn’t understand.  It’s only an issue in a module used for Nvidia compatibility, but Ubuntu and other Debian-based releases depend upon the presence of this particular library and act very erratically without it.  The kernel also does not seem to be stable with this option, so on both counts it was a bad choice.  I was just trying to harden the systems against stack exploits.

Kernel Updates Postponed

     It was fortunate that kernel upgrades were delayed, as I had installed the new kernel on my workstation and the “fixes” broke the kernel much worse than it was.  The old kernel had the potential to do something wrong, so far unrealized on the servers, but the current release is taking kernel oopses on a regular basis, which is much worse.

     And Ubuntu, which I’ve been using since 2012, has become way too Microsofty now, just like Red Hat before them, and has really twisted my arm into changing distros again, at least on my workstation.  There is no one distro that really does everything I need, so instead of being dual-boot my workstation is going to become triple-boot with Win10, Debian, and Manjaro.  Manjaro will be my daily driver, with Debian for those cases where I need to run commercial software such as Anydesk, which is not available for Manjaro because Manjaro builds pretty much everything from source.

     What Ubuntu has done recently is screw up the libs such that I can’t have a 32 bit version of Wine peacefully co-exist with virtual machines, and I can’t have anything gtk3 peacefully co-exist with the rest of the system.  They’ve gone to distributing libgtk3 with snap, and they are sending a version that does not match what most of the other software was compiled against, so I’m getting “LD failed to preload libgtk3” missing symbol errors left and right.  I don’t like snaps.  They are slow as snails, totally insecure, and often unreliable, and in most cases they include their own libs and containerize things which neither have a reason to be containerized nor work well in that environment.

     And to add insult to injury, the most recent Firefox they distributed, version 100, is not configured the way I like, and when I tried to fix that I got a “Managed by Canonical” notice (Canonical is Ubuntu’s parent company) and it did not let me.  Well, if I wanted that I’d run Windows as my daily driver.  NO THANK YOU.  So I’m going to be mostly tied up Sunday arranging my machine correctly.  Kernel upgrades will wait until the next point release, and then only after I’ve had a few days to be sure they’ve corrected whatever they broke in 5.17.11.  Perhaps 5.18.1 will be out by then and will fix the compile problems I found in 5.18, which I did file a bugzilla report on.  In the meantime I will also file a report on the kernel oopses in 5.17.11 to make sure they are aware of them.

     I built version 102 of Firefox from source today and unfortunately it has a nasty bug where it will not save the default profile.

Outage

     Sorry about the outage tonight between about 3am and 4:30am.  I screwed my workstation up to the point where it became necessary to restore from backups, but that did not work.  Even after re-installing grub, running update-grub2, etc., it’s trying to boot a kernel that doesn’t exist.  So I’m using an old version of Linux I keep on a hard drive in case the flash fails, but I’m going to have to re-install.

     Anyway, while connected to our main file server via x2go, I forgot I was connected remotely and went to shut down the machine, thinking it was my workstation I was shutting down.  Then I discovered it was the server.

     So I had to drive to the co-lo and power the server back up.  When I went to leave, my car would not start.  Something is screwed up with the theft prevention system; it does not recognize either key fob and would not let me start the car.  So I had to spend $332 to tow it to the garage and get my wife out of work early to come get me.  I’ve been up all night, and my workstation is still broken.

     So I will not be available during the day today.  Sorry, but I need to get some sleep.

     So in the last week two of our cars broke down, and one that was broken magically fixed itself.  I think it was the fuel pump relay, but it started working again so we’re driving it now.  I also broke two of my teeth, which won’t be fixed until mid to late July because the dentist is so backed up, and I broke my workstation.  Great, so trouble is coming not in threes but at least in fives or sixes.

     If anything is still down, please generate a ticket via the ticket system (https://www.eskimo.com -> Support -> Tickets) and I will attend to it when I am among the living again.

Kernel Upgrade

     I did not get the kernel upgrade done last night.  I broke a tooth and had to go to bed early so I could get to the dentist today, and 5.18.0 had some compiler errors.  I’ve got patches from the developers that fix them, but I haven’t had time to make and install the new kernels yet.  So far nobody has reported connection difficulties, but if you experience them in the meantime, mosh is a possible work-around.  Otherwise I’ll at least get the guest machines done tonight or tomorrow.  Host servers (physical machines) will have to wait until I know I’ve got a working car.

     With respect to the new router, I have it here and am still in the process of figuring out how to configure it.  The interface is quite different from the old machine’s, owing to vastly expanded capabilities.  This is really a full-fledged multi-media box, not just a router, although in our case it will be used just as a router.  It is also a video recorder, network controller, etc.

Emergency Kernel Upgrade Tonight

     This affects all eskimo.com services:

     I am planning an emergency kernel upgrade of at least the most used shell servers, the mail server, and the web server tonight.  I may not be able to get to everything, as I cannot stay up late.  I broke a tooth last night and need to get up to see a dentist tomorrow.

     This upgrade mostly involves the fallback of MPTCP to TCP.  MPTCP (Multipath TCP) is a protocol that allows a connection to be maintained even when the end point IP addresses change.  It is mostly used with mobile phones, but a connection to any device that does not talk this protocol needs to fall back to TCP.  These would mostly be older devices without wireless capability.
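     To illustrate what "falling back" means, here is a minimal Python sketch of the application-side analogue: ask the kernel for an MPTCP socket and use plain TCP if it refuses.  This is not the in-kernel code that had the bug; on Linux, an MPTCP socket talking to a peer that does not speak MPTCP is dropped back to TCP by the kernel itself.  The function name open_stream_socket is made up for this example.

```python
import socket

# Linux defines IPPROTO_MPTCP as 262; Python 3.10+ exposes it as
# socket.IPPROTO_MPTCP on platforms that support it.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_stream_socket() -> tuple[socket.socket, str]:
    """Return (socket, kind), preferring MPTCP and falling back to plain TCP.

    Hypothetical helper for illustration only.
    """
    try:
        # Fails with OSError if the running kernel has no MPTCP support.
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_MPTCP)
        return sock, "mptcp"
    except OSError:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)
        return sock, "tcp"


sock, kind = open_stream_socket()
print("opened a", kind, "socket")
sock.close()
```

     Either way the application ends up with an ordinary stream socket, which is why a working fall-back path is invisible to users and a broken one shows up as connection trouble.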

     The existing kernel has a bug in this fall-back code.  Because some of my customers have antique computers that may be affected, I want to try to get the publicly facing computers upgraded if possible tonight but I can’t stay up late to do it.  I’m going to focus on the physical servers in case they fail to reboot.  I will not have a car available after tonight until sometime next week as my wife’s car needs to go into the shop and she will need mine to get to work.

     I also have the new router but have not had time to install the drive and get it configured.  This should happen in the next week or so.

Non-censored Federated Search Engine

     Now that duckduckgo.com and swisscows.com/ch are censoring, as bing.com and google.com have always done, and yandex.com censors too, just different content, there is nothing else out there that does not censor except for yacy.com.  Searx only proxies requests to the engines above; it provides privacy but does not provide uncensored results.

     So to that end, I’ve installed a federated Yacy server and made it publicly available.

     You can reach it at https://yacy.eskimo.com/ or from https://www.eskimo.com/ (web apps) menu.

     It’s just started indexing so the local database isn’t very complete yet but it will return results from ALL of the federated Yacy peers so you will see good results even now.  They will only get better with time.

     However, use with caution, remember this is NOT a censored search engine so no doubt you will find things in the Not Safe For Work and NOT Family Friendly category as well as ALL political views, even those that offend you.

Replacement Router Ordered

     I finally received the info I needed from Ubiquiti and we have a new router ordered.  It should be here in 7-10 days and will provide 6.8x the CPU power of the existing router.

     Our existing router is sometimes saturating between 6pm-9pm Pacific Daylight Time, even though the bandwidth is not anywhere near maxed out.  This causes latency and dropped packets.

     The existing router handles high throughput of TCP traffic if packets are large, but it does not handle high UDP traffic or small TCP packets.  The new router should improve this substantially.
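     The reason small packets hurt is that a router does work per packet, not per byte, so the same bit rate made of small packets means many more packets per second.  The arithmetic can be sketched in Python as below.  The 500 Mbit/s figure and the packet sizes are made-up illustration values, not measurements from our router.

```python
def packets_per_second(bits_per_second: float, packet_bytes: int) -> float:
    """Packets a router must process each second to carry the given bit rate."""
    return bits_per_second / (packet_bytes * 8)


# Hypothetical traffic level, chosen only to show the ratio.
link_rate = 500_000_000  # 500 Mbit/s

for size in (1500, 64):
    rate = packets_per_second(link_rate, size)
    print(f"{size:>5}-byte packets: {rate:,.0f} packets/s")
```

     At the same bit rate, 64-byte packets require about 23 times as many packets per second as 1500-byte packets (1500 / 64), which is how a router CPU can saturate while the bandwidth is nowhere near maxed out.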

     I also bought an optional 1TB hard drive for the router (it has 128GB of flash internally) which will allow me to install a nice desktop and more security software.

     For anyone looking for a router, I really love Ubiquiti products because of their excellent firewall capabilities and interface, and because they are Debian-based.  You can basically install anything from Debian as long as it is present in the MIPS ports, and if it isn’t you can compile it yourself, as the router also includes the normal Debian port of GCC and libs.

 

Kernel Upgrades are Completed

     Kernel upgrades and reboots are completed.

     All services are active.  All NFS mount points working.  All NIS binding successful.

     The only server where there was a problem was one customer’s private virtual server, and that was because he had configured SSL but the SSL certificates were not present.

     Not really sure what changed, but all the machines booted significantly faster this time around.  Not sure if this kernel boots better or if some changes Ubuntu made improved efficiency.

     This impacts all eskimo.com services including shared web hosting, virtual private servers, e-mail, shell access, https://friendica.eskimo.com/, https://hubzilla.eskimo.com/, and https://nextcloud.eskimo.com/ federated social media services.

Maintenance Tonight

     In about 1-1/2 hours I will be going to the co-location facility to extract a broken server.  This should not be service impacting, except that it is in a physical stack that will require disturbing other machines, and whenever that happens there is always the potential for a power cord to fall out or something similar.