Three strikes and out? Microsoft, Twitter, Google clouds suffer

It’s been a pretty heavy week for the cloud, with three major providers suffering outages of varying severity, giving us another reminder that no cloud is perfect.

Microsoft’s Windows Azure cloud was down across Western Europe for approximately two and a half hours, with a flag on the Azure service dashboard stating:

“We are experiencing an availability issue in the West Europe sub-region, which impacts access to hosted services in this region. We are actively investigating this issue and working to resolve it as soon as possible”.

Indeed, the situation was eventually resolved with full service functionality restored. Similar problems were experienced in the West US region concerning management degradation; this took just over four hours to rectify.

This isn’t the first time Windows Azure has felt the wrath of the cloud gods. Back in February Microsoft’s cloudy outing was down for several hours, and during testing in 2009 the system went dark for a mighty 22 uninterrupted hours.

Please don’t take a picture...

Over in Silicon Valley things weren’t much better, as Google Talk was intermittently unavailable.

The service update page posted a service disruption flag at 1140 GMT on Thursday, but went to a full-blown outage ten minutes later, with the dashboard note stating: “The affected users are able to access Google Talk, but are seeing error messages and/or other unexpected behaviour”.

Hourly updates came and went offering nothing but the promise of another update until 1625 when the green light came back on and Google announced: “The problem should be resolved.

“Please rest assured that system reliability is a top priority at Google, and we are making continuous improvements to make our systems better.”

Twitter was also down for a brief period on Thursday. The company posted in a tweet: “Howdy folks, looks like we’re experiencing a small interruption of and some mobile clients. Thanks for your patience!”

Reports have suggested that a surge of Olympic-related tweets was the primary factor in taking the site down for around two hours, although the site could still be accessed remotely and through some mobile apps during this period.

But Twitter wasn’t the only platform to be knocked by Olympic frenzy. The UK government’s feted G-Cloud suffered an embarrassing outage on the eve of the 2012 London Olympics, reportedly due to a hosting error.

Elsewhere in the cloud...

These four isolated incidents in the same week add a huge amount of weight to the idea that the cloud is not infallible. But the next question is: what are the repercussions going to be?

Amazon, whose EC2 cloud went down at the end of last month when storms hit the Virginia area, taking services such as Instagram and Netflix with it, didn’t get away lightly.

A dating site fell out of love with Amazon Web Services (AWS) and left for Las Vegas-based FiberHub, arguing that 100% uptime should be a required service level agreement (SLA).

Elsewhere, Hewlett Packard’s new SLA promised an increasing number of service credits depending on how far its Cloud Object fell short of 99.95% uptime, a target that is certainly above the 99.917% overall uptime that research from the International Working Group on Cloud Computing Resiliency (IWGCR) found last month.
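To put those percentages in perspective, here is a rough sketch of the arithmetic behind them. The function names are made up for illustration, and the measurement windows (a 365-day year and a 30-day month) are assumptions, since the SLA periods are not stated here.

```python
HOURS_PER_YEAR = 365 * 24   # assumption: a non-leap year
HOURS_PER_MONTH = 30 * 24   # assumption: a 30-day month


def allowed_downtime_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted by an uptime percentage over a period."""
    return (1 - uptime_percent / 100) * period_hours


def achieved_uptime_percent(outage_hours, period_hours=HOURS_PER_MONTH):
    """Uptime percentage achieved if outages total outage_hours in the period."""
    return (1 - outage_hours / period_hours) * 100


# Compare the 99.95% SLA target with the 99.917% average quoted above.
for target in (99.95, 99.917):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime allows about {hours:.2f} hours of downtime a year")

# A single two-and-a-half-hour outage, measured over one 30-day month.
print(f"2.5 hours down in a month is {achieved_uptime_percent(2.5):.2f}% uptime")
```

On these assumptions, 99.95% leaves room for roughly 4.4 hours of downtime a year against about 7.3 hours at 99.917%, and a single two-and-a-half-hour outage would on its own pull a month’s uptime down to around 99.65%.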

Plenty of differing opinion, but what would be an acceptable level of uptime for your enterprise? Were you affected by any of the big outages this week?
