Microsoft Azure cloud outage blamed on undetected testing bug
Picture credit: iStockPhoto
Microsoft Azure cloud storage went down for several hours overnight knocking various sites and services offline, with the tech giant blaming the fault on an undetected bug during testing.
The incident can be located on the Azure status history page, under ‘Multiple Azure Services’ and multiple regions.
“From 19 Nov 2014, 00:52 to 05:50 UTC a subset of customers using Storage, Virtual Machines, SQL Geo-Restore, SQL Import/Export, Websites, Azure Search, Azure Cache, Management Portal, Service Bus, Event Hubs, Visual Studio, Machine Learning, HDInsights, Automation, Virtual Network, Stream Analytics, Active Directory, StorSimple and Azure Backup Services in West US and West Europe experienced connectivity issues," it explains, adding: "This incident has now been mitigated.”
According to an Azure blog, the problem came about after an issue arose during an upgrade rollout which went unchecked during the several weeks of testing prior to release, which Microsoft calls ‘flighting’.
“During the rollout we discovered an issue that resulted in storage blob front ends going into an infinite loop, which has gone undetected during flighting,” Jason Zander, CVP at Microsoft Azure wrote. “The net result was an inability for the front ends to take on further traffic, which in turn caused other services built on top to experience issues.
“We will continue to investigate what led to this event and will drive the needed improvements to avoid similar situations in the future,” he added.
Microsoft is engaged in being the global leader in cloud and infrastructure, and has made good progress in recent months. Last month a note from Synergy Research found that while Amazon Web Services (AWS) was still the clear market leader in cloud infrastructure services, Microsoft had pulled clear of its competition, including Google and IBM, for second place.
According to Synergy, in the second quarter of this year Microsoft grew at 164%, compared to IBM at 86%, AWS at 49% and Google 47%.
John Dinsdale, chief analyst and managing director of Synergy Research Group, told CloudTech the outage was “really not good” and Microsoft’s response was “an awful lot less than stellar.”
“There will be blushes at Microsoft for sure, but there really needs to be a lot more than that,” he said. “As Microsoft and its competitors try to broaden the appeal of cloud infrastructure services beyond DevOps-type applications, a key to growth is proving that cloud services are enterprise grade and capable of supporting business critical applications.
“Events like this clearly do not help the cause,” he added.
The firm has also made aggressive product and partnership plays in recent months, including sealing a deal with Dropbox for enterprise storage. Analysts at the time questioned the validity of the partnership, arguing Dropbox needed an enterprise footprint, while Microsoft needed to give assurances it will ‘play nicely’ with others.
You can read more about the Azure outage here. Were you affected?