
Microsoft Azure Failure: SSL Certificates Were Updated... Sort Of

Posted by Unknown Lamer
from the embrace-the-slack dept.
judgecorp writes "Microsoft has published an explanation of the failure of Windows Azure earlier this month. Users of the Azure storage saw that an SSL certificate had expired. Microsoft's explanation says that the certificate had in fact been renewed, but an update with the new certificate details was not prioritized, and hadn't actually been implemented till after the old certificate expired. There are more interesting details, but Microsoft says better alerts and more automation will stop this particular fault happening again."
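The "better alerts" mentioned in the summary can be illustrated with a small expiry check. This is only a sketch under stated assumptions, not Microsoft's actual tooling: the function names (`fetch_not_after`, `days_until_expiry`, `should_alert`) and the 30-day warning window are hypothetical, and it relies only on Python's standard `ssl` and `socket` modules.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the 'notAfter' string of the certificate served by host:port.

    With the default context, an already-expired certificate makes the
    handshake raise ssl.SSLCertVerificationError, which is itself worth
    alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days between 'now' (epoch seconds) and the certificate's expiry."""
    if now is None:
        now = time.time()
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now) / 86400


def should_alert(not_after: str, warn_days: int = 30,
                 now: Optional[float] = None) -> bool:
    """True when the certificate expires within the (assumed) warning window."""
    return days_until_expiry(not_after, now) <= warn_days
```

Note that a check like this has to run against the certificate actually being served, not the one sitting in the certificate store. According to the summary, the renewed certificate existed but had not been deployed, so a check on the store alone would have passed.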

  • by phantomfive (622387) on Tuesday March 05, 2013 @01:34AM (#43076075) Journal
    Yeah, and they also had the Sidekick outage [cnet.com] with actual data loss. A lovely quote from that article:

    "I asked Microsoft for comment Saturday when I was writing this, in particular as to how the rest of its cloud might differ from the Danger set up. Microsoft said Sunday that the fabric controller that manages the Azure service is built with redundancy in mind."

    It may be built with redundancy in mind, but apparently it still has at least one single point of failure.

  • by phantomfive (622387) on Tuesday March 05, 2013 @02:33AM (#43076303) Journal
    Maybe. It seems to me that if the engineers have let the manager become powerful enough to be a single point of failure, they've designed the system wrong.
  • by frinkster (149158) on Tuesday March 05, 2013 @12:49PM (#43079953)

    "None of which can claim to be better than 99.999% uptime, since it's practically impossible to achieve."

    Having worked for half a decade on mobile communications infrastructure that regularly exceeds 99.999% uptime, I feel qualified to say that it is neither impossible nor super difficult. If it is a goal and you are willing to spend a lot of money, then you can accomplish it.

    But nobody is going to pay $X for 99.99999% uptime when 98% uptime is available for $X / 100 unless they are forced to. Look at all of the various highly-funded internet services that go down completely when a single Amazon data center has an outage. They aren't even willing to pay a little bit extra and do the extra work to make their services run on multiple data centers at a time. Clearly, it is not a requirement of the venture capital that they are getting.
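For a sense of scale, the uptime figures quoted in this comment translate into yearly downtime budgets as follows. This is a minimal arithmetic sketch, assuming a 365-day year; the function name `downtime_seconds_per_year` is illustrative only.

```python
# Assumes a 365-day year; leap years shift the figures slightly.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Seconds of allowed downtime per year at the given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


for pct in (98.0, 99.999, 99.99999):
    seconds = downtime_seconds_per_year(pct)
    print(f"{pct}% uptime allows about {seconds:.1f} seconds of downtime per year")
```

Under that assumption, 98% uptime allows a bit over a week of downtime per year, 99.999% allows roughly five minutes, and 99.99999% allows only a few seconds.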
