2nd June 2009

If you're wondering why this website has been down since Saturday, and why everything is missing, here's something approaching an explanation. The executive summary, if you can't be bothered to read all this, is that dog (the web server you're reading this from) had massive filesystem problems and died, taking all the data with it.

Who ate all the pies

So. At about noon on Saturday I made a tiny little error in a Perl script. The error caused the script to spawn endless copies of itself and eat up all the memory. The usual response to that is to reboot the server in question, dog, which is where this website is hosted. So Jon contacted the powers that be to ask them to do just that.

Then the system was fine again. The out-of-memory killer, a kernel-based assassin whose job it is to find which process is hogging all the memory and kill it, had done its job. Everything worked for a couple of minutes until dog got power cycled, it being too late to cancel the request. And then it completely failed to come back up.

Bad superblock. No biscuit.

Something had broken the filesystem a bit. The true cause of this is still unknown. Dog was using ext3, which as a journalled filesystem shouldn't be susceptible to the problems you can see with non-journalled filesystems when you power them off unexpectedly. In any case, it had errors. Eventually dog was cajoled into waking up again. Running the fsck utility, which again is the normal thing to do in these situations, made it a lot worse. Files were pretending to be directories, the root directory wasn't a directory, and most damaging of all, none of the binaries were executable any more.

At 7.20pm on Saturday, dog was switched off to prevent any further damage. Examination of the filesystem the next day provoked much in the way of swearing but turned up no realistic chance of any data recovery.

And what backup would that be, then?

The last backup of this website was way back in 2007. I managed to rescue some of the newer stuff before dog died completely, but a lot of stuff is lost. The Daily Mail Headlineinator, Clocktails (for those of you who were pointed at it), Spleen Spleen Sploul, and most of the programs since September 2007 are now lining the great bit-bin in the sky.

The main blog and recent content survive, of sorts. I rescued those by catting the files to the terminal and copying and pasting them. Of course, cat didn't work because the filesystem had decided it wasn't executable, and nor did any other binaries. An exercise for the reader is to work out how I managed to cat a file (give or take some whitespace) using only bash builtins. :-)
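For anyone who'd rather skip the exercise, here is one possible answer. It's a minimal sketch, not a record of what was actually typed on dog at the time, and the function name builtin_cat is made up for illustration. It relies only on read, printf and [, all of which are builtins in bash, so it keeps working when no external binary will run.

```shell
# Hypothetical helper: print a file without running any external program.
# read, printf and [ are all shell builtins, so no binary has to be executable.
builtin_cat() {
    # IFS= stops read trimming leading/trailing whitespace;
    # -r stops it treating backslashes as escape characters.
    # The [ -n ... ] test catches a final line that lacks a trailing newline
    # (it is printed with a newline added).
    while IFS= read -r _line || [ -n "$_line" ]; do
        printf '%s\n' "$_line"
    done < "$1"
}
```

Usage would be something like `builtin_cat /path/to/file`. Drop the `IFS=` and the quotes and it still works, but leading, trailing and repeated whitespace gets mangled along the way.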


The good news is that dog is now back up. It would have to be, otherwise you wouldn't be reading this. The bad news is that it's devoid of any data.

I'll restore all the stuff I still have over the next few days. One of the Perl scripts that made up the Trenchcoat blog and comments system didn't survive at all (the file couldn't even be read), so I'll have to redo that one, but hopefully I'll have something resembling the website back up by the end of this week.

Lastly, many thanks to the admins for getting dog back up and running so quickly, especially as they have full-time jobs and maintain dog on a voluntary basis in their spare time. Clearly my rantings of yesteryear about Jon's laziness are no longer justified, so it's time to remove them from my we... oh.