News for nerds, stuff that matters

Explosion At ThePlanet Datacenter Drops 9,000 Servers

Posted by kdawson on Sunday June 01, @02:07PM
from the could-happen-to-anyone dept.
An anonymous reader writes "Customers hosting with ThePlanet, a major Texas hosting provider, are going through some tough times. Yesterday evening at 5:45 pm local time an electrical short caused a fire and explosion in the power room, knocking out walls and taking the entire facility offline. No one was hurt and no servers were damaged. Estimates suggest 9,000 servers are offline, affecting 7,500 customers, with ETAs for repair of at least 24 hours from onset. While they claim redundant power, because of the nature of the problem they had to go completely dark. This goes to show that no matter how much planning you do, Murphy's Law still applies." Here's a Coral CDN link to ThePlanet's forum where staff are posting updates on the outage. At this writing almost 2,400 people are trying to read it.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by Anonymous Coward on Sunday June 01, @02:11PM (#23618759)
    Electricity is a fickle mistress: one moment she's gently caressing your genitals through gingerly applied electrodes, the next she's blowing up your data centers.
  • by QuietLagoon (813062) on Sunday June 01, @02:12PM (#23618761)
    ... for posting frequent updates to the status of the outage.
  • explosion? (Score:5, Funny)

    by Anonymous Coward on Sunday June 01, @02:14PM (#23618775)
    Lesson learned: don't store dynamite in the power room.
  • At this writing almost 2,400 people are trying to read it. Posting it on slashdot should help speed it up.
  • by Scuzzm0nkey (1246094) on Sunday June 01, @02:20PM (#23618809)
    I wonder what the dollar value of the repairs will run? I'm sure insurance covers this kind of thing, but I'd love to see hard figures like in one of those MasterCard commercials:
    Structural damage: $15000
    Melted hardware: $70000
    Halon refill: $however much halon costs
    Real-Life Slashdot effect: Priceless
      • Re:Recovery costs (Score:5, Insightful)

        by macx666 (194150) * on Sunday June 01, @02:30PM (#23618893) Homepage

        Not to mention the cost of pulling all those consultants in, overnight, on a weekend...

        Also, only the electrical equipment (and structural stuff) was damaged - networking and customer servers are intact (but without power, obviously).
        I read that they pulled in vendors. Those types would be more than happy to show up at the drop of a hat for some un-negotiated products that insurance will pay for anyway, and they'll even throw in their time for "free" so long as you don't dent their commission.
  • Clearly this is bad karma resulting from all their years of human rights violations....especially Tiananmen Square...oh wait--
  • by quonsar (61695) on Sunday June 01, @02:22PM (#23618839) Homepage
    At this writing almost 2,400 people are trying to read it

    and as of this posting, make that 152,476.
  • by martyb (196687) on Sunday June 01, @02:30PM (#23618897)

    Kudos to them for their timely updates as to system status. Having their status page listed on /. doesn't help them much, but I was encouraged to see a Coral Cache link to their status page. In that light, here's a link to the Coral Cache lofiversion of their status page:

    • http://forums.theplanet.com.nyud.net:8080/lofiversion/index.php/t90185.html
  • by Sentry21 (8183) on Sunday June 01, @03:27PM (#23619379) Journal

    electrical gear shorted, creating an explosion and fire that knocked down three walls surrounding their electrical equipment room.
    But the fourth wall stayed up! And that's what you're getting, son - the strongest data centre in all of Texas!
    • It is often the case that transformers are kept apart from all other components
      And that appears to have been the case here. Had you read the article, or even the unusually accurate headline, you would know that the 9,000 servers were 'dropped' rather than 'blown apart'. They are still physically with us, they are just dropped from service because they don't have any power because the power supply blew up.

      Further, the 9,000 servers were physically, geographically, isolated enough from the power supply (which is what exploded) to be protected. We know this to be the case because we read the article and headline and understood them and they indicate that the 9,000 servers were not blown up.

      To put it another way, only the power supply was damaged by the explosion, the servers were not. Probably there was no way to isolate the power from its own explosion. The servers, however, were protected.

      So, in summary, the 9,000 servers were not blown up. Only the power.

      The power is off due to the explosion but the servers themselves are A-OK.
      • by cecil_turtle (820519) on Sunday June 01, @03:38PM (#23619461)
        ThePlanet has 5 or more datacenters. The cost and complexity of doing a full blown physically separated 2N power system at every datacenter is far more expensive than taking the chance of having to issue a credit against an SLA. Not to mention that when a fire is involved, the fire department has full authority and may instruct you to cut all power anyway - they are coming in to an unknown situation and won't risk their own people just because you say the other power system is isolated.

        Another issue is that the complexity of a full blown 2N power system is likely to cause more outages due to human error during routine maintenance than an N+1 system would. Complete 2N power systems from grid and backup sources all the way to the servers with no single point of failure (transformers, wiring, switching, PDUs, UPSs, etc.) are enormously complex and expensive, so it's not "the only thing that makes sense". I assure you issuing a one-day pro-rated credit to all your customers is cheaper.
      • by p0tat03 (985078) on Sunday June 01, @02:49PM (#23619091) Homepage

        I'm a mechanical/electrical engineer by training, and what you're saying makes no sense to us. Mistakes are made in the laboratory, where things are allowed to blow up and start fires. Once you hit the real world the considerations are *very different*. While it's possible that this fire could be caused by something entirely unforeseeable (unlikely given our experience in this field), it's also possible that this was due to improperly designed systems.

        I don't suppose you'd be singing the same tune if this was a bridge collapse that killed hundreds. There's a reason why engineering costs a lot, and that's directly correlated to how little failure we can tolerate.

    • Re:Explosion? (Score:5, Informative)

      by Gazzonyx (982402) on Sunday June 01, @02:54PM (#23619137)
      Actually, modern batteries should be sealed valve or Absorbed Glass Mat (AGM) that don't vent (too much) hydrogen. During a thermal runaway, they vent a tiny bit before killing themselves, but hydrogen doesn't become explosive until the concentration in an enclosed environment is ~4%. 4% of a data center is a fairly large area. I've heard of this happening in one data center where the primary and fail over (IIRC) HVAC units failed and no one had been on site for well over a month. IOW, every battery in the place started venting and it took over a month without any air circulation for it to get to 4%.