
AWS us-east-1 Outage Disrupts Major Services Globally
A 3-hour networking incident in us-east-1 cascaded into Lambda, DynamoDB, and IAM degradations across regions.
Full summary
Amazon Web Services experienced a major service disruption originating in the us-east-1 region. The root cause, per the AWS post-incident summary, was a misconfigured update to internal network telemetry that caused the regional control plane to lose visibility into a subset of availability zones. Lambda invocations, DynamoDB writes, and IAM credential refreshes all saw elevated error rates for approximately three hours. The outage propagated to other regions because IAM's global control plane runs primarily out of us-east-1. Downstream services including Slack, Coinbase, Atlassian Cloud, and Heroku reported user-visible impact during the window.
Why it matters
us-east-1 remains structurally important to the global AWS control plane. Multi-region architectures that depend on us-east-1 for IAM cannot fully insulate themselves from this class of incident.
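One way this plays out in practice: long-running workers that periodically refresh credentials through IAM/STS fail the moment a refresh call errors, even though their current credentials may still be valid for some time. A minimal sketch of degrading gracefully instead, assuming a hypothetical refresh_credentials() helper (for example, a wrapper around sts.assume_role):

```python
import time

# Hypothetical cache: reuse still-valid credentials when the IAM/STS
# refresh path is degraded, instead of failing the worker outright.
_cached = {"creds": None, "expires_at": 0.0}

def get_credentials(refresh_credentials):
    """refresh_credentials is a hypothetical callable returning
    (creds, lifetime_seconds), e.g. a wrapper around sts.assume_role."""
    now = time.time()
    try:
        creds, lifetime = refresh_credentials()
        _cached["creds"], _cached["expires_at"] = creds, now + lifetime
    except Exception:
        # Refresh failed (e.g. an IAM control-plane incident). Keep
        # serving with cached credentials until they actually expire.
        if _cached["creds"] is not None and now < _cached["expires_at"]:
            return _cached["creds"]
        raise
    return _cached["creds"]
```

This does not remove the us-east-1 dependency, but it turns a hard failure into a grace window.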
Technical explanation
The telemetry update changed BGP metrics inside AWS's internal backbone, causing routes to flap. The customer-visible symptom was API calls returning 5xx errors with Retry-After hints; clients that retried immediately rather than backing off created a retry storm that extended the recovery time.
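The retry-storm dynamic is avoidable on the client side. A minimal sketch using boto3's built-in retry configuration, which applies exponential backoff and, in "adaptive" mode, client-side rate limiting (the retry modes and max_attempts option are real botocore settings; the service and region here are illustrative):

```python
import boto3
from botocore.config import Config

# Cap total attempts and let botocore apply exponential backoff with
# client-side throttling ("adaptive" mode) instead of hammering a
# degraded endpoint with immediate retries.
retry_config = Config(
    retries={
        "max_attempts": 5,   # total attempts, including the first call
        "mode": "adaptive",  # backoff plus adaptive client-side rate limiting
    }
)

# Any regional client accepts the same Config object.
dynamodb = boto3.client("dynamodb", region_name="us-east-1", config=retry_config)
```

A fleet configured this way backs off collectively as error rates rise, which shortens recovery instead of prolonging it.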
Business impact
SLA credits will be issued. Customers with high-availability requirements should re-examine their disaster-recovery topology, and specifically the implicit us-east-1 dependencies in global services like Route 53, CloudFront, and IAM.
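One concrete audit step: CloudFront only accepts ACM certificates issued in us-east-1, so every certificate backing a distribution is an implicit us-east-1 dependency. A short sketch of enumerating them with boto3 (list_certificates is a real ACM API; the output handling is illustrative):

```python
import boto3

# CloudFront distributions can only use ACM certificates from us-east-1,
# so each certificate listed here is pinned to that region.
acm = boto3.client("acm", region_name="us-east-1")
for cert in acm.list_certificates()["CertificateSummaryList"]:
    print(cert["DomainName"], cert["CertificateArn"])
```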
⚡ Action needed
Audit your runbooks: do they assume IAM and Route 53 are available during a us-east-1 incident? Plan to use STS regional endpoints and split IAM-dependent code paths.
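As a starting point for the STS piece, a minimal sketch of pinning credential refresh to a regional endpoint with boto3 (the sts.<region>.amazonaws.com endpoint pattern and get_caller_identity are documented AWS APIs; the region choice is illustrative, and newer SDKs can achieve the same via the AWS_STS_REGIONAL_ENDPOINTS=regional setting):

```python
import boto3

# By default, some SDK configurations resolve STS to the global endpoint
# (sts.amazonaws.com), which is served out of us-east-1. Pinning a
# regional endpoint keeps credential refresh working during a us-east-1
# incident.
sts = boto3.client(
    "sts",
    region_name="us-west-2",
    endpoint_url="https://sts.us-west-2.amazonaws.com",
)

# Cheap liveness probe suitable for a runbook check.
caller = sts.get_caller_identity()
print(caller["Account"], caller["Arn"])
```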