Car Hits Utility Pole, Takes Out EC2 Datacenter
1sockchuck writes "An Amazon cloud computing data center lost power Tuesday when a vehicle struck a nearby utility pole. When utility power was lost, a transfer switch in the data center failed to properly manage the shift to backup power. Amazon said a "small number" of EC2 customers lost service for about an hour, but the downtime followed three power outages last week at data centers supporting EC2 customers. Tuesday's incident is reminiscent of a 2007 outage at a Dallas data center when a truck crash took out a power transformer."
Re:Obvious solution (Score:2, Informative)
Think of the poor strippers man!
Re:It's failure on multiple levels (Score:5, Informative)
Re:UPS's (Score:3, Informative)
Strips of steel with holes in them? You're kidding, right?
No. It would be 50*15*5 mm steel with a 10mm hole drilled in each end. A bolt goes through each hole into a threaded attachment point.
Now that you mention it, I recall that a four inch nail is good for 100A slow blow, but that's cylindrical so it conducts nicely. You'd think the rectangular cross section would not conduct quite as well (sharp corners, etc.) but maybe it is also tuned for the desired current. A little saw cut halfway between the holes would do that.
Not really (Score:5, Informative)
All a fuse is is a piece of metal that will melt fairly quickly when a given amount of current is passed through it. Idea being that it heats up and melts before the wires can. So, the bigger the current, the more robust the metal connecting it. A 100A fuse is usually a fairly large strip of steel.
Now I'll admit that just grabbing an approximate size of steel and placing it in as the GP did isn't going to yield a nice precise fuse. It may have been too high a current. However, it'd work for getting things running again and probably provide a modicum of protection in the event of a short.
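To put a rough number on the "may have been too high a current" point, here is a back-of-envelope sketch of the resistive heating in a strip of the size described upthread (50 x 15 x 5 mm). The resistivity figure is an assumption for plain carbon steel, not something anyone in the thread measured, and the function names are made up for illustration. It also uses the full 50 mm as the current path, which overstates the resistance slightly since the bolts sit inboard of the ends.

```python
# Rough estimate only: resistive heating in a uniform steel strip.
# ASSUMPTION: resistivity of plain carbon steel is about 1.5e-7 ohm*m
# (it varies with alloy and temperature).
RESISTIVITY_STEEL = 1.5e-7  # ohm*m, assumed value


def strip_resistance(length_m, width_m, thickness_m, resistivity=RESISTIVITY_STEEL):
    """R = rho * L / A for a uniform rectangular conductor."""
    area_m2 = width_m * thickness_m
    return resistivity * length_m / area_m2


def heating_power(current_a, resistance_ohm):
    """P = I^2 * R, the power dissipated as heat in the strip."""
    return current_a ** 2 * resistance_ohm


# Strip from the thread: 50 mm long, 15 mm wide, 5 mm thick.
r = strip_resistance(0.050, 0.015, 0.005)
for amps in (100, 1000):
    print(f"{amps} A -> {heating_power(amps, r):.1f} W")
```

Under that assumed resistivity the strip comes out at about 0.1 milliohm, so 100A dissipates roughly a watt in it. That is nowhere near enough to melt a 75 mm^2 steel bar, which is consistent with the strip passing far more than 100A before it lets go.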
Re:Oil's Well (Score:1, Informative)
I do believe there is a whooshing sound you missed. He is referring to the BP Gulf oil spill, although that was not caused by a power failure.
Re:UPS's (Score:3, Informative)
Just be glad nobody got killed...
Shorting out something in a main power junction could easily have created a fairly nasty fire...
Re:UPS's (Score:3, Informative)
What the hell is the "white phase"? Unless I am missing some newfangled data-center lingo, you are talking about the neutral, which is not a "phase" at all, and could never produce such a fault current when "shorted" to ground since it is already tied to ground at the panel. Am I missing something?
You have three actives (red, white, dark blue here in .AU), a neutral and an earth. The Wikipedia page says different countries have different color codes, so maybe that is the confusion.
Re:Murphy's law (Score:1, Informative)
I did give the meaning of his post a good deal of thought before replying. I am familiar with LOLcats, LOLCODE and more than my share of despair.com inspired "fail" jpegs. Whether his reply was a joke or not could go either way. On one hand, structuring the response so that the "joke" is where a punchline would be is one clue. The "fail" is another minor indication, but not enough to sway. If he had made the degree of epicicity of the fail explicit (even in a binary fashion), that would be sufficient for me to conclude that it was a joke - but he didn't.
OTOH, there was nothing particularly egregious about the OP's grammar or post to warrant the overly pedantic and snarky reply. If the OP hadn't used the semicolon, there wouldn't have been enough of a pause for "unlike yours truly." to have much comedic effect. While not Strunk and White kosher, surely it's acceptable for slashdot. At this point, you have a pointless post and what may or may not be a joke that succeeds or fails based on whether the OP has ridiculously bad grammar. Which equates to pointless post + unfunny joke (fail), or pointless post + unintentional grammatical error in the context of a grammatical correction (epic fail). Which in turn further condenses down to pointless post + fail, and if you consider a pointless post to be a fail, then fail + fail = fail, QED.
Re:Oil's Well (Score:3, Informative)
I have a friend who is an engineer on one of the projects in the North West Shelf (of Western Australia). A few weeks back he asked, "How can they build a rig in the Gulf of Mexico for one third of our costs?" Two days later one blew up and he got his answer.
Re:UPS's (Score:4, Informative)
Re:Murphy's law (Score:1, Informative)
I have seen that myself. When a PHB tells me "security has no ROI", I die a little.
I have been at well-architected datacenters. There is a reason they have two physical drops for lines on the building, and it is exactly to deal with the backhoe problem. This way, if someone cuts their primary connection, it would fail over to the peer. Power was distributed the same way, so even though most of the building might go dark, the CRAC/HVAC system would keep running. If both lines got cut, then that is what the diesel generator and batteries were for (batteries were replaced every 2 years).
Having multiple drops for anything is a requirement for anything Tier III or above, and in reality needs to be a part of Tier II. At Tier I it doesn't need to be an issue, but people have to realize that the risk is there and decide whether the company can handle multi-day downtime. Failing to heed this is a misrepresentation of quality of service.
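For a feel of what the second drop buys you, here is a minimal sketch of the usual parallel-redundancy arithmetic. The 99.9% figure for a single feed is purely hypothetical, the function names are made up for illustration, and the whole calculation assumes the feeds fail independently.

```python
HOURS_PER_YEAR = 8760


def combined_availability(feed_availabilities):
    """Availability of redundant feeds, ASSUMING independent failures:
    the site is down only when every feed is down at the same time."""
    all_down = 1.0
    for availability in feed_availabilities:
        all_down *= (1.0 - availability)
    return 1.0 - all_down


def downtime_hours(availability):
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


single = 0.999  # hypothetical availability of one drop
dual = combined_availability([single, single])

print(f"one drop:  {downtime_hours(single):.2f} h/year")
print(f"two drops: {downtime_hours(dual) * 3600:.0f} s/year")
```

With those made-up inputs, one drop is down about 8.76 hours a year and two independent drops are down for roughly half a minute. The catch is the independence assumption: anything shared between the paths, such as a single transfer switch or both lines hanging off the same pole, puts you back near the single-feed number.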
Re:Where's your cloud now? (Score:3, Informative)
Re:Again: The IT Uptime Lightweights (Score:5, Informative)
When was the last time anyone heard of a TV Network going dark for an hour?
Hmm, let me think. How about yesterday [bbc.co.uk]?
Re:Again: The IT Uptime Lightweights (Score:2, Informative)
It's a matter of price versus "what if". YOU try to convince a pointy haired boss to spend thousands and thousands of extra dollars on something that "may" happen.
It's often hard enough to convince higher ups to just upgrade old infrastructures that are maxed out on resources, even if you have proof of issues or near failures. The ONLY time they will happily spend money on upgrades and making your infrastructure more robust is after there has been a critical failure and they actually see their bottom line being hurt. Even then, if you don't get the approval and dollars fast enough, you run the risk of "What are the chances THAT will happen again?"
More often than not, infrastructure is patches built on patches, one I.T. guy coming in trying to "correct" mistakes of his/her predecessor (who they then realize was working with an underwhelming budget), THEN realizing that it's such a mish mash of bubblegum and duct tape, that any serious fixes would require serious downtime with a complete overhaul. Otherwise you run the risk of the whole thing imploding like a blackhole.
How many I.T. guys seriously have the guts to walk up to their boss after being on the job for only a week and say, "I need 50k and your network will be going up and down for two weeks as I rebuild and fix it all"?
I tried it. I, however, had the ammunition that my company went from 3 people to 40 people in 18 months with another 20 predicted in the next 6 months and that the two box servers were maxed out AND that we were renovating a newly purchased building so we could plan everything from cabling, to telephony to security and future planning for 250+ people.
It also didn't hurt that my boss knows that I.T. is an investment when done right and NOT an expense. Even then with everything on my side it still took 3 months of planning, proving, mapping, designing and quoting from vendor after vendor before approval went through.
Re:UPS's (Score:1, Informative)
Only on small stuff are the hot wires black, red or blue.
Brown, Orange and Yellow are used to mark out 3 phase systems.
Still, a white wire is typically neutral.
Oh, and larger fuses are often a copper (or even better, silver) strip in a barrel filled with sand. When the strip melts, the resulting arc melts the surrounding sand, fusing everything together into a non-conductive mess. The result is a fuse that can quickly and reliably interrupt a large current, even when it's highly inductive. Simply melting a strip of metal can allow an arc to linger across the gap for a dangerous amount of time.
Re:Again: The IT Uptime Lightweights (Score:3, Informative)