The Perl NOC: 2006

Friday, December 29, 2006

Planet Upgrade

We've upgraded our planetarium software to planet 2.0. You probably won't notice anything different, but there are a few minor changes:

Atom feed support

Better handling of bad HTML

Better handling of weird encodings

See it in action at Planet Perl, Planet Perl Six, or Planet Parrot

Saturday, November 18, 2006

cpan.org mail forwarder listed in Sorbs

Sorbs decided to list 63.251.223.170 which is the IP that the cpan.org mail goes through.

I'm not entirely sure why (a bounce? an evil user? ...) but they are making it a pain to get de-listed and I just lost patience navigating their slow, broken-link-ish, wordy website. So, if you or your ISP blocks mail based on SORBS you might miss your cpan.org (and other perl.org) mail. Too bad for you.

(On a similar note: if you or your ISP are using Spamcop, you'll lose some perl.org list mail every few months because some idiotic Spamcop user/perl.org subscriber submits a mailing list message to that system.)

Update: Robert just told me we had trouble with our virus scanner (it wasn't updating the definitions) so for a few days we were bouncing some new viruses instead of just dropping them in a giant file never to be looked at like we usually do. Their "spamtrap" was joe-jobbed and we bounced a mail to it. Anyway, if you know anyone at SORBS feel free to make them de-list us. Thanks.

Thursday, November 9, 2006

Power outage again!

Yikes, the perl.org data center unbelievably had another power outage. Lots of services didn't start up; I'm working on it.

The list server is slightly broken and can't get going without a kick (we are getting a replacement), so it won't be back until I've been by to, well, kick it. I should be there in a couple of hours ...

Update: All should be well again. Email me at ask@perl.org if you find a service that's still down...

Friday, November 3, 2006

CPAN Ratings upgrade

I did a bit of work on CPAN Ratings today.

The RSS feeds for reviewers and distributions should work (better) now.

I added "alternate" headers for the RSS feeds so your tools can find the RSS feeds more easily.
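For the curious, here is a rough sketch of how a tool might discover a feed from those "alternate" headers. It assumes the headers are the usual `<link rel="alternate" type="application/rss+xml">` tags in the page head; the sample page and its URL are made up for illustration and are not taken from the CPAN Ratings site.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that point at RSS feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, so split before checking.
        rel = (attr.get("rel") or "").lower().split()
        is_rss = attr.get("type") == "application/rss+xml"
        if "alternate" in rel and is_rss and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html):
    """Return the RSS feed URLs advertised in the given HTML text."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical page head; the feed URL is invented for this example.
SAMPLE = """<html><head>
<title>Example reviews</title>
<link rel="alternate" type="application/rss+xml" title="RSS" href="/example/feed.rss">
<link rel="stylesheet" href="/static/site.css">
</head><body></body></html>"""

print(find_feeds(SAMPLE))
```

Feed readers and browsers do roughly this when you hand them a page URL instead of a feed URL.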

I also implemented a proper API for the helpful votes. This was just to make it easier to add other API things in the future (and to maybe make the site support non-JavaScript browsers some day ;-) )

If the site doesn't work properly, be sure to clear your cache, shift-reload etc etc. (The .js and .css files aren't versioned so your browser or ISP proxy might have cached the old versions).
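One common way to avoid this kind of stale-cache problem is to version the asset URLs so that a changed file gets a new URL. The sketch below is only an illustration of that idea, not how the site is built: the `versioned_url` helper, the file path and the file contents are all hypothetical.

```python
import hashlib


def versioned_url(path, content):
    """Return the path with a short hash of the file content as a query string."""
    digest = hashlib.sha1(content).hexdigest()[:8]
    return f"{path}?v={digest}"


# Hypothetical asset path and contents, before and after an edit.
old = versioned_url("/static/site.js", b"var votes = 1;")
new = versioned_url("/static/site.js", b"var votes = 2;")

# The two URLs differ, so a cache holding the old file is bypassed.
print(old)
print(new)
```

Some caches treat query strings specially, so putting the hash into the file name itself is a common alternative.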

Sunday, October 29, 2006

new list archive

I've been working on a new list archive. There's still work to be done, but you can test the Work In Progress at the beta site.

Monday, October 16, 2006

perldoc stats

As part of an effort to translate the perl documentation to other languages Joergen W Lang graphed the perldoc.perl.org logs from the first week of October.

Thursday, September 28, 2006

perlfoundation.org DNS

The perlfoundation.org DNS has temporarily reverted back to ancient settings. We expect to have it sorted tomorrow (Friday).

Emails to @perlfoundation.org might not work until then.

The websites can be accessed with the "-" version, www.perl-foundation.org and news.perl-foundation.org (the non "-" version is usually what we normalize them to).

perl.org and pm.org services are not affected.

Friday, September 1, 2006

EU search.cpan.org server back up

Some months ago our European search.cpan.org mirror went down with some hardware trouble.

Our friends at Digital Craftsmen kindly replaced the server some time ago and Graham and I finished the setup just now and put it "back in rotation".

If you are in Europe you should see a "hosted by digital craftsmen" thing at the bottom of the search.cpan.org page now.

If you notice any trouble with the service, please let us know.

Thursday, August 10, 2006

Power upgrade this weekend

The building is doing some sort of upgrade to the power system this weekend; hopefully it won't impact us in a negative way. (Crossed fingers, knock on wood).

We have been informed by the building management that on Friday, August 11, 2006, the Department of Water and Power, City of Los Angeles will be upgrading one existing 2500 KVA transformer to a new 3750 KVA transformer on the MST Grid here at the Garland Building. This project will commence at 6:00pm PST on Friday, August 11, 2006 and conclude on Sunday, August 13, 2006 at 9:00pm PST.

During the installation of this transformer the normal electrical power for the MST Grid will be transferred from the DWP utility grid to the Building’s emergency diesel generator plant. ABM Engineering has informed Morlin Asset Management that this transformer upgrade will affect the Garland Building in the following ways:

At exactly 6:00pm PST on Friday, August 11, 2006 on the MST Grid only, a two (2) minute outage will occur.

When the DWP utility grid is restored at 9:00pm PST on Sunday evening, another two (2) minute outage will take place.

We have been assured by building engineering that the service that is being upgraded will not have any impact to the power that provides service to IX2's facilities including any of the cooling equipment for the building. This work is the first step in the building's plan to increase the UPS capacity of the building. Again, this outage will not impact IX2 and our clients according to building management.

Saturday, August 5, 2006

svn.perl.org cert expired

The SSL certificate for svn.perl.org expired yesterday. We'll get it replaced ASAP.


Error validating server certificate for 'https://svn.perl.org:443':
- The certificate is not issued by a trusted authority. Use the
fingerprint to validate the certificate manually!
- The certificate has expired.
Certificate information:
- Hostname: svn.perl.org
- Valid: from Aug  4 06:20:36 2004 GMT until Aug  4 06:20:36 2006 GMT
- Issuer: Certificate Authority, Develooper, Los Angeles, CA, US
- Fingerprint: 67:09:93:f3:3b:41:f2:7e:0f:fe:6c:1b:fd:b4:2a:fb:65:f0:29:e7
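A small check like the following could warn before a certificate runs out. It is a minimal sketch, not something we actually run: it only parses the "until" timestamp format shown above using Python's standard `ssl.cert_time_to_seconds`, and the reference dates are fixed so the example is reproducible.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Days (floored) from `now` until the certificate expires; negative once expired."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


# The "until" value from the svn error above.
NOT_AFTER = "Aug  4 06:20:36 2006 GMT"

# Thirty days before expiry: time to order a replacement.
print(days_until_expiry(NOT_AFTER, datetime(2006, 7, 5, 6, 20, 36, tzinfo=timezone.utc)))

# One day after expiry: the situation described in this post.
print(days_until_expiry(NOT_AFTER, datetime(2006, 8, 5, 6, 20, 36, tzinfo=timezone.utc)))
```

Run from cron with a threshold of a few weeks, a check along these lines would have flagged the certificate well before clients started complaining.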

Wednesday, August 2, 2006

Anatomy of a(n ongoing) Disaster..

Dreamhost's datacenter is in the same building that the perl.org rack is in. They put together a wonderful blog post that's a very good summary of last week's troubles. (Of course, their troubles are an order of magnitude or seven greater than ours.)

Saturday, July 29, 2006

it's morning: svn back

svn is running again. I'm very glad I didn't try and repair it last night, because all of the repositories would have ended up vaporized, as I slumped over the keyboard and triggered the rm -rf / macro I have bound to Ctrl-Alt-2oSDLFHq.

Friday, July 28, 2006

svn.perl.org down until morning

We had another "power event" today, and at least one of the repositories on svn.perl.org ended up with some minor corruption. I need sleep, so I'm going to bed. I'll fix things in the morning. (I don't want to do a rush job tonight and mess something else up.)

Monday, July 24, 2006

Spacing Out

Apparently MySpace is hosted in the same building we are. When they go down, they get press coverage and more press coverage and international press coverage.

Sunday, July 23, 2006

Hot!

It has not been a good weekend for our datacenter. It all started on Saturday, when Los Angeles experienced record breaking temperatures. (I spent the afternoon outside, and it was a scorcher... 110 degrees plus.) There was a power failure in the building that hosts our datacenter...

the backup power kicked in, but failed after a time period when it got too hot. Our machines lost power briefly, and all but one came back up. Because Murphy's law always takes effect when you least want it to - the machine that didn't come back up was the one that hosts all the perl.org mailing lists. The datacenter personnel attempted to reset it, but as they were dealing with many other customers (much more important than us - I don't blame them), they didn't have time to hook a monitor up to our system and see what was going on. So, at 11pm, I drove down to the datacenter to find out that all it wanted was "press F1 to continue". Further diagnosis showed that the bios battery was gone, and the case open sensor kept tripping. Even if the bios was set to not prompt, it would "conveniently" forget that fact. (Did I mention that I was leaving to go to Portland for OSCON in the morning?)

Today, we received a note that we may lose power due to some emergency maintenance the building was going to perform to repair electrical damage caused by yesterday's outage. So, instead of having to deal with fscking and rapid power loss, we shut down all of the systems. Several hours later, we attempted to turn them back on - but only 50% came up! The datacenter staff helped reset the rest, and gave the ornery list box from above the 'f1' treatment. Everything is back up and happy now.

I know that several other companies hosted in the same building lost power, and not just in our datacenter. One, a large perl shop, is still down -- going on six hours. Larger deployments are more concerned with heat dissipation, so they need to wait for things to cool down. I'm very happy with our hosting arrangements - they've been very helpful with getting boxes reset - and I know things are worse for them than they are for us.

This weekend has identified some weaknesses in our architecture, and we're going to be working over the next few months to solve them. While it doesn't make sense for us to have a fully distributed system, we could definitely use more redundancy in some core systems. We'll probably be posting here with an updated "wish list" soon.

Fingers crossed that the rest of this week goes smoothly. It's no fun having to deal with a datacenter from hundreds of miles away.

perl.org outage again!

The facility we are in needs to turn off the power for 4-6 hours(!!!) starting around 3pm PST to repair the UPS and switch back to utility power. Yikes, really bad timing with the hackathon at OSCON and all. :-(

Our servers will shut down around 10 minutes before that and hopefully come back when power comes back. One or two might need an extra kick which we'll get done tonight or tomorrow morning.

More when we know more.

(and of course our European search.cpan.org mirror is out too, so we can't even keep that running. Grrrh).

Saturday, July 22, 2006

perl.org lists are back up

The perl.org mailing lists are back up. (Thanks to Robert who went to the datacenter to kick it!).

In related news it seems like we could use a new 1U box (with two disks, preferably SCSI) to run the mailing lists. Email ask@perl.org if you have something to spare.

When the power went out the UPS kicked in (and then the generators), but apparently the HVAC system failed and ~30 minutes later something overheated and shut down our power momentarily. Whee. :-(

perl.org outage

Our datacenter had a power outage, including the UPS systems (!). Some things didn't start up properly, making other things not start up properly etc etc.

We are working on it and everything should be back up shortly(-ish). Right now we're waiting for a ~400GB partition on an incredibly slow raid-5 system to run fsck ("/dev/vg1/lv0 has gone 309 days without being checked, check forced.").

Update ~19:40PST: We have almost everything running again. The mailing list server didn't seem to come back after the reboot so no mailing list mail yet. Robert is calling the data center to see if they can put a crash cart on it.

Tuesday, July 18, 2006

perl-qa subscribed to perl6-all ?!

Anton Berezin pointed out that the perl-qa postings were mistakenly being posted to the perl6-all meta list. Ooops (and fixed). Thanks Anton!

Wednesday, July 12, 2006

Volunteer (still) needed

We posted in May about needing a volunteer to hack up a simple script. Still do!

This time we have a few pointers ready though, so hopefully we can get a volunteer started before he or she gets distracted and disappears on us.

As a slightly larger task, we could also use some help from someone who'd be interested in hacking on pgeodns, our geographic load balancing name server written in perl.

Email ask@perl.org if you are interested and I'll get you access to our little Wiki.

Monday, July 3, 2006

perlbug upgraded

Finally, the moment you've all been waiting for is here!

perlbug (http://rt.perl.org/rt3/) has been upgraded to the latest and greatest version (RT 3.6).

Here are some changes you might notice:

- a new shiny look

- no more auth.perl.org, we now authenticate directly from bitcard.org

- a public interface that doesn't require you to log in to see tickets

- a much more powerful search interface

- things that were slow before, are not quite so slow anymore

- saved searches

Likely, you'll discover some things are broken, or don't work the way they used to. Here's a few we know about:

- Old bookmarked searches can't be used anymore. (Sorry!)

- Some bitcard accounts (with accented characters in their names) can't login.

- we have a mild performance issue related to CSS caching.

If you run into issues, big or small, please send an email to perlbug-admin at perl.org. Your message will be answered in the order received.

-R

(special thanks to Jesse Vincent, Kevin Riggle, Thomas Sibley, and all the rest of the gang at Best Practical, for their help, patience, and for the rt.cpan.org customizations, which made this much easier than it might have been)

Friday, June 30, 2006

svn.perl.org outage

svn.perl.org is momentarily down while we nudge some of the underlying Berkeley DB databases.

update: it's back.

Saturday, June 24, 2006

Robert talks about perl.org on the Perlcast

Josh McAdams interviewed Robert for the Perlcast about perl.org.

There's a little quip about volunteers near the end and speaking of that: we are still looking for someone to help update our fancy DNS server. We had three volunteers a month ago, but they all got busy. The first task we need is a simple tool to convert a text file from one format to another. I hear this perl tool is good for that. :-) I'm insisting on not just fixing the old code we have that does the job because I really want to get a few more people involved with what we do...

Tuesday, May 16, 2006

Volunteer needed for simple script

Hi everyone,

I need someone to hack up a simple script for me. It's critical to the perl.org and CPAN infrastructure, so it's a worthwhile task even if a bit tedious.

It's a 20-30 minute job at most, but I figure if I can spend 5 explaining it to you then I'll have 15-25 minutes to work on something I can't delegate so easily.

Email me at ask@perl.org if interested in helping.

Update: I got a few volunteers, thanks!

Friday, May 5, 2006

perl.org lists and nntp.perl.org down

We are having some hardware trouble with the mail/nntp server. We are working on it...

Update: Robert went to the datacenter to kick the box, check the failing disk etc. The server is up again and mail to the lists has started flowing. We need to do a little work on the nntp server before re-enabling it, so nntp.perl.org is still down.

Update 2 (~5.30pm PDT): Robert finished sorting out the NNTP setup so it shouldn't explode again. One of the drawbacks of using mostly old donated hardware is the retarded BIOSes even in "server hardware" from a few years ago. Anyway, it's all better now.

Thursday, April 6, 2006

Dearest Spammer

Dearest Spammer,

Sending a message to all cpan.org addresses will not get you customers. Even if you are pushing a perl related product and not penis enhancement drugs. You are now blacklisted by us and hopefully listed in spamcop and other blacklists. Do not expect it to be removed unless you bribe us in a big way. I only wish I had caught your spam before you sent it to over 600 addresses.

Sincerely,

me.

Saturday, March 25, 2006

Expanded Universe

The universe is expanding. A few new faces have been added to http://planet.perl.org/. Hope you enjoy them.

Saturday, March 11, 2006

ViewSVN back

ViewVC (formerly ViewSVN, and ViewCVS before that) started giving semi-random server errors after Robert upgraded our Subversion server.

Yesterday he upgraded to the latest ViewVC and since then it hasn't given any trouble, so you can browse the repositories again. It might start breaking again of course; so keep your fingers crossed. :-) (If something there isn't working, please let us know).

There were a few brief service interruptions for some services while I was in the datacenter earlier (moving some cables around and trying to get an old switch to talk to our console server). All should be well now.

Not the Terminator

Ask is heading to the datacenter to go install our new Sun Fire T2000.

Friday, February 24, 2006

Subversion Upgrade Complete

To everyone playing along at home, our subversion server has been upgraded, moved, vacuumed, manicured, and gotten a hair cut. It should be faster and more shiny. (Some new features aren't available yet, but will be soon.) Also coming is a new front page, with answers to all those questions you never knew you had. Let svn at perl.org know if you find anything wrong or broken.

Tuesday, February 21, 2006

Subversion Upgrade Soon

At some point later this week, or maybe this weekend, I will be taking the perl.org Subversion server down to move it to different hardware. This most likely won't happen until after the imminent Parrot release.

From an end user perspective, nothing should change. (Although we are upgrading to a newer version of the server and associated programs.) The server will be there, and then it will be down for a few hours... and then it'll be back, bright, shiny and happy. More details once it is complete.

Tuesday, February 14, 2006

Cover opened, please panic!

You may have noticed that you didn't get any perl.org or pm.org mailing list email from 9am until 12 noon (pacific) this morning. We had a slight failure that required manual intervention. (The machine OOMed, and ssh got hosed as a side effect.)

I drove into the data center, and after a bit of fiddling to get the display working, discovered that the machine was very patiently waiting for a keypress because the cover sensor had been disturbed. That "feature" has been disabled.

No mail has been lost. Everything should have been caught up by now.

Monday, January 23, 2006

Everything is green again

Just got back from the datacenter. Everything was running again about 60-90 minutes ago.

Since the last reboot the server in question had more disks installed. The disks are split across two 3ware raid controllers (one with the new disks and one with the old). The bios decided to try booting off the controller with the new disks – of course there's no boot record on those so it just stalled.

The serial console was working, but for some reason it'd not show me anything until it had skipped past the "press button to enter bios setup" window. Robert could get into the bios remotely, but he was too lazy to check the settings so downtown I went. Grrh.

file server reboot

We have one particular server that's hosting many of our databases, NFS for our internal software distributions etc etc. We reboot it a couple of times a year to get kernel updates and such. One of those times is in a minute (so a short outage of some services will occur). Crossed fingers it'll come up alright, we don't want to go to the datacenter right now. :-)

update: eek, it's not coming back. crap.

Thursday, January 19, 2006

Overload!

You may have noticed some trouble accessing some perl.org services this morning. Our network link is being saturated, because one of the DNS services we host is getting hammered. We've shut down the server on our side, but the other side is still sending. Mad props to InterNAP for being helpful in diagnosing and putting workarounds in place to get us stable again.