What caused Amazon’s AWS outage, and why did so many major apps go offline?
A major outage at Amazon Web Services (AWS) on Monday disrupted a large portion of the internet, taking down apps, websites and online tools used by millions of people around the world before services were eventually restored.

The hours-long breakdown of a cloud system that supports a portion of the internet revealed just how much of modern life depends on that infrastructure, from banking apps and airlines to smart home devices and gaming platforms.

What happened and what caused the AWS outage?

At about 07:11 GMT, Amazon’s cloud service experienced a major outage: some of its systems stopped working, which disrupted many popular apps and websites, including banks, gaming platforms and entertainment services.

The problem started in one of AWS’s main data centres in Virginia, its oldest and biggest site, after a technical update to the API (a connection between different computer programs) of DynamoDB, a key cloud database service that stores user information and other important data for many online platforms.

The root cause appears to have been an error in the update that affected the Domain Name System (DNS), which helps apps find the correct server addresses. DNS works like the internet’s phone book, turning website names into the numeric IP addresses that computers use to connect to servers.

Because of the DNS issue, apps could not find the IP address for DynamoDB’s API and were unable to connect. As DynamoDB went down, other AWS services also began to fail. In total, 113 services were affected by the outage.

By 10:11 GMT, Amazon said that all AWS services had returned to normal operations, but that there was a backlog “of messages that they will finish processing over the next few hours”.
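The DNS "phone book" step described above can be sketched in a few lines of Python. This is only an illustration, not a reconstruction of AWS's systems: the hostname `database.example.com`, the address `192.0.2.10`, and the helper names `lookup_addresses`, `healthy_dns` and `broken_dns` are all hypothetical. The two stand-in resolvers imitate a working and a failing DNS so the example runs without touching the network.

```python
import socket


def lookup_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the IP addresses DNS reports for hostname, or [] if the lookup fails."""
    try:
        # By default this asks the system's DNS resolver for the name.
        records = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The name could not be resolved, so the app has no address to connect to.
        return []
    # Each record is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({record[4][0] for record in records})


def healthy_dns(hostname, port, **kwargs):
    """Hypothetical stand-in for a working DNS: the name maps to one address."""
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.10", port))]


def broken_dns(hostname, port, **kwargs):
    """Hypothetical stand-in for a failing DNS: the name cannot be resolved."""
    raise socket.gaierror("name could not be resolved")


# With a working phone book, the app learns where to connect.
working = lookup_addresses("database.example.com", resolver=healthy_dns)

# With a broken one, the server may be running, but the app cannot find it.
failed = lookup_addresses("database.example.com", resolver=broken_dns)
```

In the sketch, the failed lookup returns an empty list even though nothing is said about the server itself, which mirrors the article's point: apps were cut off because they could not find the address, not necessarily because the database had stopped running.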
