Topic

Chaos Engineering

Every system that has never failed in production is a system that has never been properly tested. These are the stories of companies that built tools to break their own infrastructure — deliberately, in production, during business hours — and what they learned when the chaos worked exactly as intended.

Netflix Unleashed a Monkey With a Weapon in Its Own Data Center — On Purpose

It was 2011 and Netflix had just migrated hundreds of microservices to AWS. Their architecture was distributed, horizontally scaled, and theoretically fault-tolerant. But theory and production are different things. The only way to know if a system could survive failures was to cause failures — constantly, deliberately, during business hours, and in production. So they built a monkey.

10 Simian Army members