It’s 8pm on Friday… do you know where your servers are?

Yeah. So Friday night we’re out questing in Northrend. Just trying to level up and stuff. I look over to my laptop and notice that the mailserver is offline. Weird.

I switch to the backup wireless, but the mailserver and the website are still not there. Steve does some diagnostic work and can’t get into some (but not all) of our machines. We call the colo, and they have no record of anything down.

We can’t get into anything, including the remote console and the remote power switch, which has me freaking out because those are supposed to, y’know, mean that we’re not totally crippled by hardware problems and don’t have to go to the colo. But something is very wrong, so we pile into the car to head down and check the hardware.

Just to add more stress (as if being totally off the air for unknown reasons isn’t enough), there’s a major accident on 101 and lots of traffic. We miss most of it through judicious use of side roads, though.

We get to the colo and our two redundant boxes are down. Flashing yellow lights with “call service.” They won’t reboot from the resets or the power button on the front. I am trying not to hover, so I pace around the facility to stay out of the way. Eventually, Steve makes the decision to pull the power cords and reboot that way.

Happily, that works. The boxes come back up. Unfortunately, we ran out in such a hurry that Steve doesn’t have a power supply for his laptop (and the battery is just about dead). Besides, he needs a Windows session to deal with the VMs. We get stuff up enough that (we hope) he can connect to the boxes from home. He can, he does, and we were only off the air for about 2 hours.

It’s not clear why we lost 4 power supplies in two boxes at the same time. Yes, they both have redundant power and all 4 power supplies blew.

So on Saturday we’re out questing again and I notice that my mail client is offline. &#@*!% I try to pull up the blog and that’s offline too. I tell Steve we’re down again, and he tries our services. I’m in the middle of an almost full-blown panic attack because, uh, this is a Very Bad Thing. As Steve tells me that things are up, I happen to notice that my laptop has dropped its wireless connection. Yeah, it’s just me; I have no internet at all. Oops.

In my defense, the symptoms I noticed Saturday were exactly the same as those I noticed on Friday. And normally I assume it’s me, not the servers, if things break. But I am still a little worried about why stuff broke, so I jumped before checking things.

Ah, the joys of working for yourself.



