Microsoft pins Azure outage on network miscue

Thursday’s Windows Azure outage in Europe was caused by a misconfigured network device, according to Microsoft.

Users reported that the cloud service went down early Thursday, and Microsoft’s Azure dashboard confirmed a 2.5-hour outage. Now the company has offered its first clue as to what went awry. In a blog post, Mike Neil, GM of Windows Azure, wrote:

“The interruption impacted our Compute Service and resulted in connectivity issues for some of our customers in the sub-region. The service interruption was triggered by a misconfigured network device that disrupted traffic to one cluster in our West Europe sub-region. Once a set device limit for external connections was reached, it triggered previously unknown issues in another network device within that cluster, which further complicated network management and recovery.”

He added that the team is still investigating the full root cause of the incident and will report back in the blog next week.

This was the second major Windows Azure glitch in the last few months, after the big “leap day” outage. Microsoft is trying to build Azure into a viable competitor to Amazon Web Services, although the two started from different places: Azure began as a platform as a service, while Amazon's offering is more basic infrastructure as a service. Both behemoths, however, are moving onto each other’s turf.

And neither is immune from snafus. Amazon suffered two high-profile outages earlier this summer.

Feature photo courtesy of Flickr user Carlos Gutiérrez G.



GigaOM