Linux woes

2014-12-23

Over the past week all of my computers seem to have crashed more or less simultaneously, though seemingly independently: the four year old installation of Ubuntu 13.10 (running GNOME) on my laptop, a freshly installed (from an official disk no less) win7 on a desktop, and the half year old Kubuntu on a different desktop. What the hell is going on?

So first the Ubuntu (GNOME) on my laptop decided to fail after a certain update. It ditched the GNOME skin and gave me a rather bare-bones KDE'ish menu bar. After a while that menu would disappear as well, leaving me with.. well, nothing. Alt-tabbing still works, as do _certain_ key combos. For example, ctrl-alt-t works to open a terminal, but alt-f2 doesn't. This means I have to use the terminal to start certain apps and even to shut down or reboot. Quite annoying. Luckily it's a laptop so closing the lid will make it sleep. At least until I get around to fixing that mess.

I tried doing an upgrade to 14.10 but it hung the system at some point. After a reboot, nothing seemed to have changed. It's running a dist-upgrade through the cli now; I'm curious to see whether that does fix things.

Next was my old computer that I turned into a game rig. I installed a fresh win7 on it and let steam download some games. When I returned to it, it had crashed horribly. On closer inspection, it _seemed_ as if the ssd had died on it. What the hell? Super coincidence or what? Happily the standard IT trick worked on it (open the machine, have a long look at it, close the rig, fixed). The drive showed the same problem again later, once. But since then it seems to function correctly. I'm still worried about that though since, well, it might do the same thing at any point later. But since it's been demoted to a game rig, I guess it's not a big deal. That is, until it happens midway through a big campaign without save points, of course.

And last week the weirdest one of them happened; after watching a movie my machine acted as if the main root ssd had died as well. Quite similar to the windows machine, actually. When I rebooted it, which was pretty much the only option, grub started to complain about "the disk drive for /dev/mapper/cryptswap is not ready yet or not present". What the fuck? Never had that before, nor was there any reason for that to happen suddenly.

A few reboots later and it wasn't going away. I managed to get into the grub menu (hold shift) and use safe mode to, eventually, get back into ubuntu. I tried fixing grub but the problem persisted on the next reboot. Crap.

So the updater had been begging me to update from 14.04 to 14.10 for a few months now. I thought the upgrade would be a good idea now and hoped it would fix all the things. Or at least whatever was wrong. Well... that turned out to be a bad idea.

So picture this; the update UI has an option to show the terminal as updates are being installed. It's pretty limited but still kinda better than hardly any progress information at all. Unfortunately, midway through it showed that the update had exited, and there was a long list of similar-looking errors, scrolling the actual problem out of view. Bleh.

I wanted to inspect what the resulting state of my machine was but I quickly discovered that it was very bad. I couldn't open any program, key bindings were broken, icons were disappearing on hover. Uh-oh. I had literally no option but to reboot now. And the reboot brought me back to the grub problems. The only difference now was that there was no longer an option to get into (my) Ubuntu because X would simply show me a black screen.

I had no idea what was wrong. The only clue I had was that it happened before the network was activated. Double crap. After a while I discovered apt-get was in a bad state. Circular references which were partially missing. Seemed that tripped up the upgrade which just bailed at some point. So I removed the offending packages and tried to reignite the upgrade. Which. Failed. Of course. Because there was no more network. Craaaaap.

Okay so now what. I tried getting the network to work but to no avail. Neither manual prodding nor dhcp would bring it back, so the network option was out. Then I found a way to manually download packages. Basically you get all the urls that apt-get wants to download (sudo apt-get -qq --print-uris dist-upgrade). The -qq suppresses all output except for errors, so what's left is one line per package with the url in single quotes at the front, which you can filter out (| cut -d\' -f 2). Store this to a file and have wget download them on another computer. Then transfer these files back to the target computer through usb. From usb copy the files to /var/cache/apt/archives and then apt-get will consider them downloaded, and proceed.
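A rough sketch of that workflow, for reference. The extract_uris helper and the names uris.txt and debs are made up for illustration, and the grep is an extra guard (not part of the original one-liner) that drops any line that doesn't start with a quoted url, such as an error message.

```shell
#!/bin/sh
# Pull the quoted url out of each line of `apt-get --print-uris` output.
# Those lines look like: 'http://HOST/path/foo.deb' foo.deb 12345 SHA256:...
# extract_uris is a hypothetical helper name, not an apt tool.
extract_uris() {
    grep "^'" | cut -d"'" -f2
}

# On the broken machine (apt only prints what it would fetch, no network needed):
#   sudo apt-get -qq --print-uris dist-upgrade | extract_uris > uris.txt
# On a machine with a working network, after carrying uris.txt over:
#   wget --input-file=uris.txt --directory-prefix=debs
# Back on the broken machine, with the debs directory carried back:
#   sudo cp debs/*.deb /var/cache/apt/archives/
#   sudo apt-get dist-upgrade
```

The commented commands are the order of operations, not a script to run in one go, since they happen on two different machines.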

Ok so that seems pretty nice except for one thing; usb is not that trivial to do from just the cli. So the good news is that I got more closely acquainted with using usb through the cli. Basically you figure out the name of your usb drive (which may change during the process, that doesn't help); lsblk is one of many ways to get that name. Create a mount point for your drive, like sudo mkdir /media/usb. Then mount the drive on this mount point (sudo mount -o rw /dev/sdf1 /media/usb, assuming the usb drive is at /dev/sdf1; this is probably different for you) and you should be able to access it. I had some problems writing files to the usb drive (which seemed to resolve after fiddling with the mount params, and a bit more patience before removing the drive). You'll want to write the urls to it because they are super long and there may be many for a dist upgrade. You don't want to be manually copying them :p
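The usb part could be sketched like this. The copy_debs helper is a made-up name, /dev/sdf1 and the debs directory are assumptions, and the sync plus umount at the end are there because unmounting flushes pending writes, which is probably what the "patience before removing the drive" was making up for.

```shell
#!/bin/sh
# Copy every .deb from one directory to another, then flush write buffers
# so the data is really on disk. copy_debs is a hypothetical helper name.
copy_debs() {
    src=$1
    dest=$2
    cp "$src"/*.deb "$dest"/ && sync
}

# Typical sequence around it (check lsblk first, the device name will differ):
#   lsblk
#   sudo mkdir -p /media/usb
#   sudo mount -o rw /dev/sdf1 /media/usb
#   sudo cp /media/usb/debs/*.deb /var/cache/apt/archives/ && sync
#   sudo umount /media/usb
```

Only pull the drive out after umount has returned.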

Anyways, after copying the downloaded files from the usb to the /var/cache/apt/archives dir on the target computer, doing the dist upgrade worked like a charm. Seems to have fixed my computer right back up. Makes me a happy camper. I'm still slightly worried about the grub thing, though. I still don't understand it and I'm not convinced I've seen the last of it.

So with the two desktops able to crash at any random moment, but seemingly stable otherwise, I'm left with a laptop that by now failed to upgrade and, I think, is in the same state as my desktop was. At least now I know how to fix that...

Edit: turned out, later, that the SSD broke down and I had to replace it. Luckily it was the OS drive (as SSDs ought to be) so no major data loss. It's just another day and some money lost on it.