Manjaro Is Available Again

     Manjaro finally released an ISO that correctly builds x2goserver, nis, and other necessary tools.  It is installed and operational.  If there are software packages you would like installed, let me know and I’ll at least attempt it.  ELM will not build on it, just as it will not on most other modern Linux platforms, so I cannot install that for you.  Pine is there as Alpine, and the bash, csh, tcsh, ksh, and zsh shells are installed.  If you would like others, let me know.

Older Web Server Outage

     Last night I rebooted the old web server to change an IP address that for some reason the networking would not let go of, even with a network restart.  I neglected to check that everything came back up, my bad.  Somehow ownership of the encryption key for mariadb got changed so that mariadb could not read its key.  This caused it to fail to start.
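
A quick ownership check after a reboot is one way to catch this kind of problem.  The sketch below is only an illustration: the function name owned_by_uid is made up, and the key path and the mysql account in the usage comment are assumptions, not the actual values on this server.

```shell
#!/bin/bash
# Sketch: succeed only if FILE is owned by the numeric user id UID.
# Uses GNU stat (-c %u prints the owner's numeric uid).
owned_by_uid() {
    obu_owner=$(stat -c %u "$1") || return 2
    [ "$obu_owner" = "$2" ]
}

# Hypothetical usage (path and account name are assumptions):
#   owned_by_uid /path/to/mariadb/keyfile "$(id -u mysql)" || echo "key ownership is wrong"
```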

Mail

Owing to the ypbind unbound issues causing mail to be returned as no such address, I’ve created the following script, which runs from crontab once a minute on all the mail servers.  The script keeps track of the status of ypbind.  If ypbind is unbound, it shuts down postfix so that senders only get a temporary error and will queue and resend.  It then tries to restart ypbind once a minute until that succeeds, at which point it starts postfix again.  This should prevent long outages if an update disables ypbind but forgets to re-enable it when completed.

#!/bin/bash
# Track ypbind state between runs; stop postfix while unbound, start it again once bound.
STATUS_FILE=/opt/status/ypmon.dat
if test -f "$STATUS_FILE"
then
    YPSTAT_PRIOR=$(cat "$STATUS_FILE")
else
    YPSTAT_PRIOR="unknown"
fi
if ypwhich > /dev/null
then
    # ypbind is bound.
    if [[ "$YPSTAT_PRIOR" == "bound" ]]
    then
        exit 0
    else
        echo 'Change from ypbind unbound to bound - starting postfix'
        systemctl start postfix
        echo "bound" > "$STATUS_FILE"
        exit 0
    fi
else
    # ypbind is unbound: try a restart on every run until it binds.
    systemctl restart ypbind
    if [[ "$YPSTAT_PRIOR" == "unbound" ]]
    then
        exit 0
    else
        echo 'unbound' > "$STATUS_FILE"
        echo 'Change from ypbind bound to unbound.'
        echo 'Stopping postfix, restarting ypbind.'
        systemctl stop postfix
    fi
fi

Sorry for the formatting, the “code” option isn’t working on my copy of WordPress.
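
For anyone who wants to test the logic of the script above without touching systemctl, the decision part can be pulled out into a function that only prints what would be done.  This is a sketch, not what runs on the mail servers; the function name and the action labels are made up for illustration.

```shell
#!/bin/bash
# Sketch: decide the action from the prior state (bound, unbound, unknown)
# and the current state (bound or unbound). Prints hypothetical action labels.
ypmon_action() {
    if [ "$2" = "bound" ]; then
        if [ "$1" = "bound" ]; then
            echo "none"
        else
            echo "start-postfix"
        fi
    else
        # While unbound, a ypbind restart is attempted on every run.
        if [ "$1" = "unbound" ]; then
            echo "restart-ypbind"
        else
            echo "stop-postfix restart-ypbind"
        fi
    fi
}
```

Calling it as ypmon_action unknown bound covers the first run after the status file is created.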

FTP Server

      The ftp server is broken.  Ubuntu upgrades replaced the libraries it was compiled against with versions that are no longer compatible.  Further, the existing source for wu-ftpd will no longer compile in a modern build environment, so I cannot fix it as I have done in the past.  The last update to the source was in 2006, so this is not likely to be fixed and I will need to move to a newer ftpd; recommendations are welcome.  In the meantime please use scp or other file transfer protocols.

Drive Replacement Successful

Drive replacement went extremely smoothly, with a total down time of 23 minutes.  The new drive is now replicating the other drive in the RAID array.  Indications are that it will take another seven hours (it’s been going for 45 minutes), so my projection of 6-8 hours seems to be spot on.  The system may be a bit slow during this interval, since it is effectively flushing the buffer continuously with new data.
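
For Linux software RAID, the kernel reports the remaining resync time in /proc/mdstat as a finish= field.  Here is a small sketch for pulling that number out; the function name is made up, and the sample line in the comment uses illustrative figures rather than output captured from this machine.

```shell
#!/bin/bash
# Sketch: read /proc/mdstat-style text on stdin and print the estimated
# minutes remaining, taken from the "finish=NNN.Nmin" field.
mdstat_finish_minutes() {
    sed -n 's/.*finish=\([0-9.]*\)min.*/\1/p'
}

# Illustrative progress line (made-up numbers):
#   [==>..................]  recovery = 10.7% (418304000/3906886464) finish=420.5min speed=138200K/sec
# Usage on a live system:
#   mdstat_finish_minutes < /proc/mdstat
```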

Maintenance April 7th 2:00AM ~2:30AM

The entire network will be offline for up to about half an hour, starting around 2AM Sunday morning, to replace a failing drive in the machine which is also acting as a router at present.  This drive is part of a RAID array so all data is duplicated and none will be lost.  If things go smoothly, it could be as short as 15 minutes; if not, then maybe half an hour or slightly longer.

The big unknown is that when software RAID comes up in degraded mode, which it will do initially until the new drive is synced, systemd will sometimes hang, necessitating going through emergency mode and bringing things up by hand.  In my experience this happens about 30% of the time.  It usually takes about 6-8 hours for the system to sync a new 4TB drive.  The system can operate while this is in progress; it’s just that Poetteringware sometimes adds some challenges.
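
As a rough sanity check on the 6-8 hour figure, the average rate it implies can be worked out with shell arithmetic.  This sketch assumes 4TB means 4 x 10^12 bytes (the decimal convention drive vendors use) and ignores any slowdown from normal system activity; the function name is made up.

```shell
#!/bin/bash
# Sketch: average sync rate in whole MB/s for SIZE_BYTES copied in HOURS.
sync_rate_mb_per_s() {
    echo $(( $1 / ($2 * 3600) / 1000000 ))
}

sync_rate_mb_per_s 4000000000000 6   # prints 185
sync_rate_mb_per_s 4000000000000 8   # prints 138
```

So a 6-8 hour sync of 4TB works out to an average of roughly 138 to 185 MB/s.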

Mail Back to Normal

Today, after some three hundred updates, the original SPF checker I was using, the python3 version, still was not working, so I installed the perl version of policyd.  I don’t really like perl as I don’t find it very readable relative to python, but presently it is working.

I also found the clamav virus check was dead and re-installed it.  Now all the mail milters (the clamav virus check, spf, dkim, and dmarc) are once again functional, so this should reduce the flood of “we’re going to make your life miserable if you don’t send 50,000 bitcoins to X” messages.

Also, the perl SPF policyd is actually somewhat better in that it checks both the ehlo host and the mail-from: host to make sure both are allowed by the site’s SPF record, while the old checker only checked the mail-from.  This will be somewhat more thorough, requiring consistency that the old checker did not.
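
If you want to look at the SPF record a domain publishes, it lives in a DNS TXT record beginning with v=spf1.  The sketch below filters dig output down to that record; the function name is made up, example.com and the sample record are placeholders, and very long records that dig prints as several quoted chunks on one line are not rejoined.

```shell
#!/bin/bash
# Sketch: read "dig +short TXT" output on stdin, strip the outer double
# quotes, and print only the lines that are SPF records.
spf_records() {
    sed -e 's/^"//' -e 's/"$//' | grep '^v=spf1'
}

# Usage (placeholder domain):
#   dig +short TXT example.com | spf_records
```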

I sent myself mail from gmail to make sure incoming was working and also watched the logs for a while.

Eskimo Site Status

Ubuntu is back.  Sorry it took so long, but there were many snags along the way.

Our old web server is running without a Network Manager because Ubuntu clods broke it.  I have to set the network interface manually after a boot.
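
For the curious, one common way to bring an interface up by hand is with the iproute2 ip command.  The sketch below only prints the commands so they can be reviewed before being run as root; it is not necessarily what is used on this server, the function name is made up, and the interface name and addresses in the usage comment are placeholders from the documentation address range.

```shell
#!/bin/bash
# Sketch: print the iproute2 commands for a static setup.
# $1 = interface, $2 = address/prefix, $3 = default gateway
manual_net_cmds() {
    echo "ip link set dev $1 up"
    echo "ip addr add $2 dev $1"
    echo "ip route add default via $3"
}

# Usage (placeholder values):
#   manual_net_cmds eth0 192.0.2.10/24 192.0.2.1
```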

Inuvik is also broken because Ubuntu 24.04 engineers mistakenly put new libs in the main repository instead of the proposed repository before they had recompiled everything that was compiled against them.  This broke many things, like the Network Manager on the old www/ftp machine and the mail milters on the mail servers.  They are feverishly working on correcting this, but in the meantime some of our machines are hanging on by a thread, and I will be re-loading Inuvik with 22.04 to get it back online.

Ice has a hard drive with one flaky sector.  Normally this would just get re-mapped onto an alternate sector, but the firmware on this drive is defective.  I could update the firmware, but the drive is 11 years old, only has a 64KB cache, and has 512 byte physical blocks, all of which make it slow by today’s standards.  I have ordered a replacement drive which has a 256MB cache, 4k physical blocks, and 7200 RPM rotational speed, all of which will provide better performance.  The failing drive is part of a RAID1 array, so no data will be lost as it is duplicated on a mate.
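
Pending and re-mapped sectors normally show up in the SMART attribute table printed by smartctl -A (from smartmontools).  The sketch below pulls the raw value of one attribute out of that table; the function name is made up, /dev/sda in the usage comment is a placeholder, and the sample rows in the test data are illustrative rather than taken from this drive.

```shell
#!/bin/bash
# Sketch: read "smartctl -A" output on stdin and print the raw value
# (10th column) of the attribute named in $1.
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage (placeholder device, needs root):
#   smartctl -A /dev/sda | smart_raw_value Current_Pending_Sector
```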