Amazon Web Services experienced minor outage on Sunday

Amazon Web Services, the world’s biggest and most well-known cloud computing provider, once again had some availability problems in its US-EAST region.

The outage stemmed from a network issue and lasted from 12:51 p.m. Pacific Time on Sunday until 1:42 p.m., according to the AWS status page. As of 3:23 p.m., AWS was reporting that most affected Elastic Compute Cloud instances were back up and running and that it was “continuing to work on a small number of instances and volumes that require additional maintenance before they return to normal performance.”

It’s hard to say how many sites were affected by the brief outage, but it appears Instagram was, at least minimally. One of the company’s engineers tweeted about the problems pretty early on.

Major shit going down in us-east-1 right now.

— Rick Branson (@rbranson) August 25, 2013

Numerous reports on Twitter noted that Flipboard and Vine were down, as well.

Instagram, Flipboard, and Vine are all down but the AWS dashboard says everything is ok. Something big has broken.

— Peter M. O’Donnell (@pmod) August 25, 2013

Airbnb acknowledged some problems, as well.

Apologies – Airbnb is among several sites & apps that are temporarily down due issues w/ Amazon servers. Investigating now & will update.

— Airbnb (@Airbnb) August 25, 2013

2013 has actually been a pretty good year thus far for AWS, with this being the first outage (at least that I recall; please correct me if I’m wrong) and a brief one at that. Last year, however, ended on a sour note with a Christmas Eve outage that even took down Netflix, a guiding light for all developers trying to build resilient applications atop Amazon’s seven-year-old but still-evolving cloud platform.

Last week, the Amazon.com front page went down for about 45 minutes and cost the company an estimated $5 million.

The US-EAST region is AWS’s most popular (and least expensive) region, which means that outages there often affect a large number of web sites and services. Best practices are beginning to emerge (from places other than Netflix, even) about how to architect applications that survive an outage in one Availability Zone, but not everyone heeds this advice, and sometimes even the best-laid plans aren’t good enough.

Here is the full AWS report on Sunday’s incident:

[Image: AWS status report on Sunday’s incident]

GigaOM
