Parts of Amazon Web Services suffer an outage

Updated: Amazon Web Services is having trouble this evening and, in the process, is taking down some major sites and services. Among the sites affected are Quora and HipChat. The outage has also hit Heroku, a division of Salesforce.

Amazon is one of the key infrastructure providers to many of the biggest and best-known startups, such as Pinterest and Dropbox. The outages were related to Amazon’s EC2 and RDS services, and the problems appeared to be localized to Amazon’s Northern Virginia data center. Other services in that data center, such as ElastiCache and Elastic Beanstalk, were also affected. On its status website, Amazon notes regarding EC2:

We continue to investigate this issue. We can confirm that there is both impact to volumes and instances in a single AZ in US-EAST-1 Region. We are also experiencing increased error rates and latencies on the EC2 APIs in the US-EAST-1 Region.

9:55 PM PDT We have identified the issue and are currently working to bring effected instances and volumes in the impacted Availability Zone back online. We continue to see increased API error rates and latencies in the US-East-1 Region.

On the issue of RDS problems, AWS notes:

9:33 PM PDT Some RDS DB Instances in a single AZ are currently unavailable. We are also experiencing increased error rates and latencies on the RDS APIs in the US-EAST-1 Region. We are investigating the issue.
10:05 PM PDT We have identified the issue and are currently working to bring the Availability Zone back online. At this time no Multi-AZ instances are unavailable.

AWS has suffered outages in the past. A widespread problem affected major websites in April 2011. In July 2008, Amazon’s S3 service went offline and caused major problems for many of its customers. I have been in touch with folks from Amazon and Heroku to get a better idea of what is going on. In the interim, enjoy some of the tweets about the outage.



GigaOM