I am going to take the host for these two servers down for a short period to further attempt to resolve CPU overheating issues. These should be reasonably brief outages: I am going to adjust the CPU voltage in the BIOS and then reboot, and it may take several attempts to find the lowest stable voltage.
I apologize for the lengthy interruption of service tonight. We worked on one machine that is running hot, and then installed an additional 4TB drive in Iglulik to provide more generations of backups of users' home directories and mail, as well as guest machine images.
We re-arranged everything in the rack to make future maintenance easier and accommodate growth.
The work on virtual, which hosts the scientific and debian machines, has been rescheduled for tomorrow (Saturday) evening into Sunday morning.
I am going to add an additional hard drive to Iglulik to provide more space for backup and to allow for expansion of the mail spool.
Late this evening (probably around 11pm), I will be taking virtual.eskimo.com, which hosts scientific.eskimo.com and debian.eskimo.com shell servers, down for maintenance.
I need to do this to troubleshoot CPU overheating issues on this machine. It is nearly identical to Iglulik in terms of hardware, but the CPU on Iglulik is barely above room temperature, whereas the CPU on this box is barely below the boiling point of water. They are both i7-2600 CPUs.
I will suspend the guests before shutting down so that anything running on shutdown will be running again on start-up.
I neglected to adjust the firewall rules to accommodate the new IP address, so ssh and nx were broken. That has been corrected.
Shellx is now restored to service and available for use. The IP address has changed, so you may need to flush your DNS cache.
If you use ssh (or anything that tunnels through ssh, such as nx or vnc) to connect, you may also need to remove the old shellx line from .ssh/known_hosts, as the IP address is different.
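For anyone who would rather script the known_hosts cleanup than edit the file by hand, here is a minimal sketch in Python. The function name `drop_known_host`, the host name `shellx.eskimo.com`, and the sample address are illustrative assumptions, not taken from our configuration. It only matches plain-text host names, so it will not find entries in a hashed known_hosts file (lines starting with `|1|`).

```python
def drop_known_host(known_hosts_text, names):
    """Return known_hosts text without entries that list any of `names`.

    Hypothetical helper for illustration; hashed entries are not matched.
    """
    targets = set(names)
    kept = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        # Keep blank lines and comments untouched.
        if not fields or fields[0].startswith("#"):
            kept.append(line)
            continue
        # An optional marker (@cert-authority or @revoked) precedes the host field.
        if fields[0].startswith("@") and len(fields) > 1:
            host_field = fields[1]
        else:
            host_field = fields[0]
        # The host field is a comma-separated list of names and addresses.
        if set(host_field.split(",")) & targets:
            continue  # drop the stale entry
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# Example with made-up data: the shellx entry is dropped, the other is kept.
sample = (
    "shellx.eskimo.com,192.0.2.10 ssh-rsa AAAA\n"
    "other.example.org ssh-rsa BBBB\n"
)
cleaned = drop_known_host(sample, ["shellx.eskimo.com"])
```

To use it on a real file, read the contents of .ssh/known_hosts, pass them through the function, and write the result back after making a backup copy. OpenSSH's own `ssh-keygen -R hostname` does the same job and also handles hashed entries, so it is probably the simpler choice where it is available.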
It turns out that shellx was the target of the attack that affected the Bellevue Co-Location facility at Isomedia. They have black-holed the IP address of shellx to protect the rest of their customers and as a result it was necessary to change the IP to bring shellx back into service.
In the process of doing so, I discovered some missing software and configuration issues that potentially affect security, so I am working to correct those. After that is done, I will need to take the machine down for 25 minutes or so to image it.
The co-location facility where we have our equipment is currently undergoing a denial of service attack. They are working to isolate the source and mitigate the attack. In the meantime it is severely impacting the routing to our equipment.
I want to thank all of you who have clicked on the “Advise Us” link and taken the time to fill out our survey.
The major thing that I’ve come away with is that our website needs to be upgraded in at least four major ways.
1) It needs to be responsive so that it is viewable and operational on smart phones and tablets. This was not only stated, especially where e-mail is concerned, but also implied by the fact that so few of our customers are connecting via smart phones even though, in 2012, they made up over 25% of primary browsers and probably even more today. Those are essentially lost customers.
2) It needs a modern interface. People are asking for features and information that are already present, which is a strong indicator that said features and information are too difficult to find. People also indicated a desire for better aesthetics.
3) There is a need for information that currently is not available on our website.
4) There is a desire for web applications to do certain things easily such as leaving a vacation message, configuring domain hosting and other services online, etc.
A lot of the complaints centered around problems that have been resolved for the last year or year and a half. We haven’t had a single crash since we upgraded the infrastructure, and there have been only a few short unintended outages, the result of upgrades gone bad. Not much I can do about the past.
A lot of people would like us to provide our own high speed access. Unfortunately, the level of capitalization that requires works on a scale of millions of customers but not on a scale of hundreds or even thousands so we are forced to work with telephone companies and wholesale providers.
Virtually all respondents said we needed better documentation.
Somehow the key for scientific.eskimo.com was damaged. In order to connect with NX, it will be necessary to re-download the key and re-import it into NX-client 3.5 or NX-player 4.0.x.
You can get the key here: scientific.key