Sunday, January 31, 2010

Rainy Day Outage (Update Four)

Good news everyone!

The few remaining services that were out should be back shortly.

We moved the failed server from Robert's house to Ask's office today and finally got enough parts replaced that the server is running again. As we hoped, all data is intact. It's currently copying all the data off to a couple of 2TB disks. Actually, I just checked and the only missing services right now are some pm.org sites and some of the historical mirrors/archives (very old versions of perl, some cpan-testers mails).

The pm.org sites should be back by the morning; the rest might need a few days for us to do the sneakernet thing to get the data to a fast enough network connection that copying hundreds of GB isn't too slow.



Friday, January 29, 2010

Rainy Day Outage (Update Three)

Not a lot of new news.

We got a new motherboard for the system and have installed it... but it's not working. That leads us to believe that it's actually the power supply that's shot. We're on the hunt for a compatible power supply locally, but it's probably not worth pouring more money into this box, so we're working on Plan B, which is to get the 3ware RAID controller installed in another system and get the data off the drives. This is complicated by the fact that we need to connect eight drives at a time and that the controller is PCI-X, a no-longer-popular form factor.

If anyone in the Los Angeles area happens to have an idle server we could borrow for a few days with a PCI-X motherboard and at least 8 PATA bays, we'd love to hear from you.  Drop us a line to webmaster at perl.org.

(We're trying to avoid Plan C, which involves taking images of all the drives one at a time and then gluing them together.)

We've initiated a plan to get rt.perl.org up and running again ASAP, and are dedicated to getting the pm.org (and other) data off these drives. 



Wednesday, January 27, 2010

Rainy Day Outage (Update Two)

Today was a beautiful day in Los Angeles, the kind with clear blue skies, almost 70 degrees F, and the rainy weather almost forgotten.....

We went to the datacenter tonight to investigate the failed hardware, and it does appear that the motherboard (or power supply crossbar bus) has failed entirely. The box is totally unresponsive. A new motherboard is on order (should arrive Thursday), and the affected machine has been moved from the datacenter to my garage for easier access. The plan is to try the new motherboard and hope it fixes the problem. Plan B is to remove the drives from the machine and connect them to another machine to extract the data. (This is complicated by the lack of spare machines with a PCI-X slot and the number of drives in the machine.)

At this time, we still believe no data has been lost and that we'll be able to safely retrieve all of it; it just might take a few days.

Currently unavailable services (different from last time):


  • rt.perl.org (we've got the data safe and sound, but our code customizations are on that machine)

  • historical cpan-testers data

  • *.pm.org websites hosted by us


This is our first "really big" outage in nearly a decade (only because that's about as far back as our memory goes), and we've definitely got some plans to ensure that it's the only "really big" outage in the next 10 years.




Tuesday, January 26, 2010

Rainy Day Outage (Update One)

Ask and I have confirmed that we've had a hardware failure and are planning to go to the datacenter tomorrow night California time to attempt to repair it.

We've been able to restore most services, but there are a few which were dependent on the failed machine.  It'll be much easier for us to get the machine fixed before bringing these services back. 

We do not believe any data has been lost; it's just temporarily inaccessible. All email and mailing lists are working (although some mail from earlier this afternoon may be delayed).

As of now, the following services are unavailable or degraded:


  • rt.perl.org

  • www.pm.org and hosted *.pm.org

  • historical cpan-testers archives

  • svn.perl.org


We'll post another update tomorrow.



Rainy Day Outage

It's raining (again) here in Los Angeles...

Some perl.org services may be experiencing an outage now.  It looks like we've got to go down to the colo to diagnose a possible power issue.  More details later.



Wednesday, January 20, 2010

List Tags Launched

Last week, we asked for help with tagging the lists.perl.org data.  I'd like to publicly thank Arthur Barrett for providing a patch that contained tag data.  We've now got a pretty tag cloud and more flexible categorization.   

Friday, January 15, 2010

List Tagging

When we redesigned lists.perl.org last month, we launched with a hard-coded list of categories maintained in static HTML files. All the rest of the data is stored in a well-structured JSON file. We're looking for a volunteer who is interested in spending a few hours going through the JSON file and adding tag-based categorization to all the lists. This will make it easier for people to find the Perl-related list they want, plus that modern invention, a tag cloud!



If you're interested, drop us a line at webmaster at perl.org, or even better, send a patch that adds a new "tag" element to every list containing a comma-separated list of tags.
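
To make the request a little more concrete, here is a rough sketch of what such a change might look like. It is written in Python purely for illustration, and the layout of the sample data (a JSON object keyed by list name), the list descriptions, and the tag assignments are all assumptions; the real lists.perl.org file may be organized differently. The only detail taken from the request above is the "tag" element holding a comma-separated list of tags.

```python
import json
from collections import Counter

# Hypothetical sample data: the real lists.perl.org JSON layout is not
# shown in this post, so this structure is an assumption.
SAMPLE = """
{
  "perl5-porters": {"description": "Perl 5 core development"},
  "beginners": {"description": "Help for people new to Perl"}
}
"""

# Hypothetical tag assignments, keyed by list name.
TAGS = {
    "perl5-porters": ["core", "development"],
    "beginners": ["help", "development"],
}


def add_tags(lists, tags_by_name):
    """Return a copy of lists where every entry has a "tag" element
    containing a comma-separated list of tags (empty if none given)."""
    tagged = {}
    for name, info in lists.items():
        entry = dict(info)  # shallow copy so the input is left untouched
        entry["tag"] = ",".join(tags_by_name.get(name, []))
        tagged[name] = entry
    return tagged


def tag_counts(lists):
    """Count how many lists carry each tag, e.g. to size a tag cloud."""
    counts = Counter()
    for info in lists.values():
        for tag in info.get("tag", "").split(","):
            tag = tag.strip()
            if tag:
                counts[tag] += 1
    return counts


lists = add_tags(json.loads(SAMPLE), TAGS)
counts = tag_counts(lists)
print(json.dumps(lists, indent=2))
```

The per-tag counts are the kind of number a tag cloud would typically use to scale each tag, though how the site actually renders its cloud is not described here.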



Link Checking

Once a month we get an email from NodeWorks, the link checking service we use.

On really good months, the subject looks like this:

[NodeWorks] 0 dead of 3164 links for http://www.perl.org 

(This particular one came after our recent redesign.  Not one dead link slipped in.)