10 Famous Outages That Changed How the Internet Works
From the Morris Worm to the Fastly CDN outage, these incidents shaped modern internet infrastructure and practices.
Every major outage teaches the industry something new. These ten incidents fundamentally changed how we build and operate internet services.
1. The Morris Worm (1988)
What happened: Robert Morris, a Cornell graduate student, released what's considered the first computer worm. It infected roughly 10% of all computers connected to the internet (about 6,000 machines).
Impact: The internet was effectively unusable for several days.
What changed: This incident led to the creation of CERT (Computer Emergency Response Team), the first organization dedicated to internet security response. It also resulted in the first federal computer crime conviction.
2. The AT&T Long Distance Collapse (1990)
What happened: A single misplaced line of code in a software update caused AT&T's long-distance switches to crash in a cascading failure: each switch that rebooted sent status messages that crashed its neighbors, which rebooted and repeated the cycle. An estimated 75 million call attempts failed over nine hours.
Impact: The entire AT&T long-distance network went down, affecting emergency services and businesses nationwide.
What changed: This became a landmark case study in software testing and the dangers of untested updates. It established the practice of staged rollouts for critical infrastructure.
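The staged-rollout idea can be sketched as a loop that pushes an update to progressively larger fractions of a fleet and halts at the first sign of trouble. This is a generic illustration, not AT&T's actual process; the stage fractions and the health-check interface are invented for the example.

```python
def staged_rollout(nodes, apply_update, is_healthy, stages=(0.01, 0.10, 0.50, 1.0)):
    """Push an update in waves; stop at the first unhealthy node.

    `apply_update` installs the update on one node; `is_healthy` is the
    post-update check. Both are supplied by the caller (hypothetical
    interfaces for this sketch).
    """
    updated = 0
    for fraction in stages:
        target = int(len(nodes) * fraction)
        for node in nodes[updated:target]:
            apply_update(node)
            updated += 1
            if not is_healthy(node):
                # Halt: the blast radius is limited to the current stage.
                return {"status": "halted", "updated": updated, "failed_at": node}
    return {"status": "complete", "updated": updated}


# Simulate a fleet of 100 nodes where the update breaks node 5:
# the rollout stops after 6 nodes instead of crashing all 100.
result = staged_rollout(list(range(100)), lambda n: None, lambda n: n != 5)
print(result)  # → {'status': 'halted', 'updated': 6, 'failed_at': 5}
```

The point is the containment: a bug like AT&T's would have surfaced after touching a handful of switches, not the whole network.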
3. The .com DNS Deletion (1997)
What happened: A human error during routine maintenance at Network Solutions pushed corrupted zone files for the .com and .net top-level domains into production, effectively wiping out large parts of the domain database.
Impact: Large portions of the .com and .net namespace became unreachable for several hours until corrected zone files propagated.
What changed: This led to improved backup procedures for critical DNS infrastructure and eventually contributed to ICANN's creation to oversee domain management.
4. The Slammer Worm (2003)
What happened: The SQL Slammer worm spread to 75,000 servers in just 10 minutes, creating so much traffic that it effectively DDoSed large portions of the internet.
Impact: Bank of America ATMs went offline, 911 services in Seattle failed, and Continental Airlines had to cancel flights.
What changed: This highlighted the danger of unpatched systems (the vulnerability had been patched 6 months earlier) and led to more aggressive automatic patching policies.
5. The Amazon S3 Outage (2008)
What happened: Amazon's Simple Storage Service went down for several hours due to a gossip protocol bug that spread corrupted state information.
Impact: Twitter, Dropbox, and thousands of websites using S3 for storage became partially or fully unavailable.
What changed: This was an early wake-up call about cloud concentration risk. It led AWS to improve their internal redundancy and helped establish the practice of multi-region and multi-cloud deployments.
6. The PlayStation Network Hack (2011)
What happened: Hackers breached Sony's PlayStation Network, stealing personal data from 77 million accounts. Sony took the network offline for 23 days to rebuild it.
Impact: One of the longest outages in gaming history, costing Sony an estimated $171 million.
What changed: This incident transformed how gaming companies approach security and data protection. It also accelerated adoption of two-factor authentication across the industry.
7. The Dyn DDoS Attack (2016)
What happened: A massive botnet called Mirai, composed largely of insecure IoT devices (cameras, DVRs, routers), attacked DNS provider Dyn with traffic reportedly exceeding 1 Tbps.
Impact: Twitter, Netflix, Reddit, Spotify, and dozens of other major sites went down for hours.
What changed: This exposed the security risks of IoT devices and led to increased focus on IoT security standards. It also accelerated adoption of multiple DNS providers and Anycast DNS.
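The multi-provider takeaway reduces to: never let one resolver's failure be fatal. The toy sketch below illustrates the fallback idea in Python; the provider names and the resolver-callable interface are invented, and a real deployment would publish NS records across multiple authoritative DNS providers rather than do client-side fallback like this.

```python
def resolve_with_fallback(hostname, providers):
    """Try each DNS provider in turn; return the first successful answer.

    `providers` maps a provider name to a callable that returns an IP
    address string or raises on failure (a made-up interface for this sketch).
    """
    errors = {}
    for name, resolve in providers.items():
        try:
            return name, resolve(hostname)
        except Exception as exc:  # provider down, timed out, etc.
            errors[name] = exc
    raise RuntimeError(f"all providers failed for {hostname}: {errors}")


def primary_down(hostname):
    # Simulates the primary provider being under attack, as Dyn was.
    raise TimeoutError("no response from primary provider")

providers = {
    "primary": primary_down,
    "secondary": lambda hostname: "192.0.2.10",  # documentation-range IP
}
print(resolve_with_fallback("example.com", providers))
# → ('secondary', '192.0.2.10')
```

Sites that had a second authoritative provider configured in 2016 rode out the Dyn attack with degraded latency instead of a total outage.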
8. The GitLab Database Deletion (2017)
What happened: During database maintenance, an engineer accidentally ran a destructive command against the production database instead of a secondary replica. All five of GitLab's backup and replication mechanisms then turned out to be broken or misconfigured, leaving a staging snapshot taken about six hours earlier as the only usable copy.
Impact: GitLab lost roughly six hours of production data (issues, merge requests, comments; Git repositories were unaffected) and the service was down for about 18 hours.
What changed: GitLab famously live-streamed their recovery efforts, setting a new standard for transparency during incidents. The incident became a case study in backup verification (or lack thereof).
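The enduring lesson is that a backup only counts once a restore of it has been verified. A minimal sketch of that idea, comparing checksums of original and restored bytes (the interfaces here are invented for illustration, not GitLab's tooling):

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_backup(original: bytes, restore_fn) -> bool:
    """Actually run the restore and compare checksums.

    A cron job that merely *creates* backups would have reported success
    in GitLab's case; only performing a restore exposes a broken pipeline.
    """
    restored = restore_fn()
    return checksum(restored) == checksum(original)


database = b"customers,orders,tickets"
good_restore = lambda: b"customers,orders,tickets"
truncated_restore = lambda: b"customers,orders"  # silently failing pipeline

print(verify_backup(database, good_restore))       # → True
print(verify_backup(database, truncated_restore))  # → False
```

Real systems restore into a scratch environment and run integrity queries rather than hashing raw bytes, but the principle is the same: exercise the restore path on a schedule, not just the backup path.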
9. The Facebook Global Outage (2021)
What happened: A command issued during routine maintenance mistakenly disconnected Facebook's entire backbone network. Its DNS servers, unable to reach the backbone, withdrew their own BGP routes, so Facebook's domains stopped resolving worldwide. Because Facebook's internal tools depended on the same infrastructure, engineers couldn't remotely access the systems needed to fix it and had to physically travel to data centers.
Impact: Facebook, Instagram, WhatsApp, and Oculus were down for over 6 hours, affecting 3.5 billion users.
What changed: This highlighted the dangers of depending on your own infrastructure for recovery. It led to widespread review of out-of-band access methods across the industry.
10. The Fastly CDN Outage (2021)
What happened: A customer configuration change triggered a bug in Fastly's software that caused 85% of their network to return errors.
Impact: Amazon, Reddit, Twitch, Pinterest, and many news sites went down simultaneously for about an hour.
What changed: This demonstrated how CDN concentration creates single points of failure for the internet. It accelerated adoption of multi-CDN strategies and renewed focus on graceful degradation.
Common Themes
Looking across these incidents, patterns emerge:
1. Complexity creates fragility - As systems grow more complex, they develop unexpected failure modes
2. Configuration changes are dangerous - Most major outages start with a configuration change, not a hardware failure
3. Redundancy isn't enough - Systems can have redundancy and still fail if the failure mode affects all replicas
4. Recovery tools must be independent - If your recovery depends on the thing that's broken, you're stuck
5. Transparency builds trust - Companies that communicate openly recover their reputation faster
Each of these outages was painful, but they collectively made the internet more resilient. Today's best practices in chaos engineering, incident response, and redundancy all have roots in lessons learned from these failures.