We're celebrating the day after 10/10/10 by rebooting many of the perl.org servers. You may notice some unavailability of perl.org services over the next few hours as we perform rolling reboots.
We're having a brief outage of some services this Sunday evening (very early Monday morning in Europe) to resize a file system, run fsck, etc. on one of the servers.
Update (9:10pm PST): Whoa, those disks are slow. The system is starting up again now; all affected services should be back shortly.
This is reposted from the CPAN Testers blog:
As of this weekend, the final switch to turn off the SMTP gateway for CPAN Testers was flipped. You can no longer post anything to the old cpan-testers mailing list, and any attempts now will result in a bounce message.
Our thanks to Robert and Ask over at the Perl NOC for looking after us all these years, and for being very patient with us while we got the HTTP gateway up and running over the last 9 months.
As a consequence, anyone who still wishes to be part of the CPAN Testers community now needs to upgrade their test environment to use the latest smokers and associated libraries. In the main, this will involve a simple upgrade of your smoker client and the installation of 4 specific modules (which in turn will install any additional prerequisites needed). You will then need to acquire a metabase profile. For full details of the necessary steps, please see the Quick Start page on the CPAN Testers Wiki.
For those casual testers, the upgrade will initially involve some manual intervention, although we hope to automate this as soon as we can. If you do have any problems, or are confused by any of the instructions, please post to the cpan-testers-discuss mailing list, where the developers and other experienced testers can help you.
The end of an era.
If you submit CPAN Testers reports via email, you may receive an email that looks like the following:
Thanks for submitting a test report to CPAN Testers!
On September 1st, 2010, we will be disabling email submissions as part of the migration to CPAN Testers 2.0. You must switch over to the HTTP based submission mechanism before that date (metabase transport), or all your test reports will be rejected.
For more information, please see the documentation: http://wiki.cpantesters.org/wiki/QuickHowToCT20
For more information on CPAN Testers in general, please visit http://static.cpantesters.org/
Please heed the warning, because in a little over three weeks, we will be turning off the inbound email gateway, and we really would like to continue to receive your module test reports.
Since Tuesday 6th June, 16:00 (CEST), profane.mongueurs.net is temporarily down because of a thermal problem in the datacenter hosting it. Most services have been migrated to spectre.mongueurs.net in order to ensure continuity of service:
Some services may be unavailable during this period.
Apologies for the troubles.
Feel free to mail <sebastien @ aperghis . net> if you have any further questions.
(Post by Sébastien Aperghis-Tramoni)
We understand that some users are having trouble reaching some perl.org sites (specifically those hosted in our Los Angeles facility.) We've escalated this issue to our network provider, and they're looking into it. We hope to have it resolved soon.
Update 1:19pm: The issues have been resolved.
Good news everyone!
The few remaining services that were out should be back shortly.
We moved the failed server from Robert's house to Ask's office today and finally got enough parts replaced that the server is running again. As we hoped, all data is intact. It's currently copying all the data off to a couple of 2TB disks. Actually, I just checked, and the only missing services right now are some pm.org sites and some of the historical mirrors/archives (very old versions of perl, some cpan-testers mails).
The pm.org sites should be back by the morning; the rest might need a few days for us to do the sneakernet thing to get the data to a fast enough network connection that copying hundreds of GB isn't too slow.
Not a lot of new news.
We got a new motherboard for the system, and have installed it... but it's not working. This leads us to believe that it's actually the power supply that's shot. We're on the hunt for a compatible power supply locally, but it's probably not worth pouring more money into this box, so we're working on Plan B, which is to get the 3ware RAID controller installed in another system and get the data off the drives. This is complicated by the fact that we need to connect eight drives at a time and that the controller is PCI-X, a no-longer-popular form factor.
If anyone in the Los Angeles area happens to have an idle server we could borrow for a few days with a PCI-X motherboard and at least 8 PATA bays, we'd love to hear from you. Drop us a line to webmaster at perl.org.
(We're trying to avoid Plan C, which involves taking images of all the drives one at a time and then gluing them together.)
We've initiated a plan to get rt.perl.org up and running again ASAP, and are dedicated to getting the pm.org (and other) data off these drives.
Today was a beautiful day in Los Angeles, the kind with clear blue skies, almost 70 degrees F, and the rainy weather almost forgotten.....
We went to the datacenter tonight to investigate the failed hardware, and it does appear that the motherboard (or power supply crossbar bus) has failed entirely. The box is totally unresponsive. A new motherboard is on order (should arrive Thursday), and the affected machine has been moved from the datacenter to my garage for easier access. The plan is to try the new motherboard and hope it fixes the problem. Plan B is to remove the drives from the machine and connect them to another machine to extract the data. (This is complicated by the lack of spare machines with a PCI-X slot and the number of drives in the machine.)
At this time, we still believe no data has been lost and that we'll be able to safely retrieve all of it; it just might take a few days.
Currently unavailable services (different than last time):
This is our first "really big" outage in nearly a decade (only because that's about as far back as our memory goes), and we've definitely got some plans to ensure that it's the only "really big" outage in the next 10 years.
Ask and I have confirmed that we've had a hardware failure and are planning to go to the datacenter tomorrow night California time to attempt to repair it.
We've been able to restore most services, but there are a few which were dependent on the failed machine. It'll be much easier for us to get the machine fixed before bringing these services back.
We do not believe any data has been lost; it's just temporarily inaccessible. All email and mailing lists are working (although some mail from earlier this afternoon may be delayed).
As of now, the following services are unavailable or degraded:
We'll post another update tomorrow.
When we redesigned lists.perl.org last month, we launched with a hard-coded list of categories maintained in static HTML files. All the rest of the data is stored in a well-structured JSON file. We're looking for a volunteer who is interested in spending a few hours going through the JSON file and adding tag-based categorization to all the lists. This will make it easier for people to find the Perl-related list they want, plus that modern invention, a tag cloud!
If you're interested, drop us a line at webmaster at perl.org or, even better, send a patch that adds a new "tag" element to every list, containing a comma-separated list of tags.
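For anyone wondering what such a patch might involve, here is a rough sketch in Python. It assumes the JSON file is an object keyed by list name, which may not match the real layout; the function name `add_tags`, the sample list names, and the "description" field are all made up for illustration. Only the "tag" element holding a comma-separated string comes from the request above.

```python
import json


def add_tags(lists, tags_by_name):
    """Return a copy of `lists` in which every entry has a "tag" element.

    `lists` is assumed (hypothetically) to map list name -> dict of details.
    `tags_by_name` maps list name -> list of tag strings chosen by a human.
    Lists without chosen tags get an empty string so the element is always present.
    """
    tagged = {}
    for name, info in lists.items():
        entry = dict(info)  # shallow copy so the input is left untouched
        entry["tag"] = ",".join(tags_by_name.get(name, []))
        tagged[name] = entry
    return tagged


if __name__ == "__main__":
    # Hypothetical sample data; the real file would be loaded with json.load().
    sample = {
        "example-list-a": {"description": "An example list"},
        "example-list-b": {"description": "Another example list"},
    }
    chosen = {"example-list-a": ["testing", "cpan"]}
    print(json.dumps(add_tags(sample, chosen), indent=2, sort_keys=True))
```

Keeping the tags as a single comma-separated string follows the wording of the request; a JSON array would be the other obvious choice, and building a tag cloud from either is a matter of splitting and counting.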
Once a month we get an email from NodeWorks, the link-checking service we use.
On really good months, the subject looks like this:
[NodeWorks] 0 dead of 3164 links for http://www.perl.org
(This particular one came after our recent redesign. Not one dead link slipped in.)