On November 15, 2024, Netflix did something it had been quietly engineering toward for three years: it streamed the biggest live sports event the internet had ever seen. 65 million concurrent viewers watched Mike Tyson and Jake Paul trade punches at AT&T Stadium in Arlington, Texas, a number that dwarfs any single streaming event Netflix had previously attempted. CDN nodes around the world hammered the origin for segments every two seconds, each chunk potentially several megabytes, from every timezone simultaneously. The pressure on what Netflix engineers call Live Origin, the custom-built microservice bridging the cloud encoding pipeline and the CDN, was unlike anything in the company's history. This was not a load test. There was no rollback button if the system buckled.
Netflix's engineering challenge with live video is categorically different from its on-demand catalog. On-demand content is encoded once, uploaded once, and then served almost entirely from the edge with the origin barely involved. Live video breaks this model. Every 2-second segment is brand new: it must be encoded, packaged, DRM-encrypted, and written to the origin within a hard real-time deadline, while dozens of CDN nodes request that same segment the moment it should exist. Netflix's existing infrastructure, including its massive Open Connect network, was built for static content; live content required the engineers to rethink storage, traffic management, and quality control from first principles.
Netflix's early live events used plain AWS S3 buckets as the segment store, and the results were brutal. A median write latency of 113ms against a 2-second publishing deadline meant the system was spending nearly 6% of every segment's window just waiting on storage acknowledgment, and p99 latencies of 267ms made late segments a near-certainty at scale.
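The "nearly 6%" figure follows directly from the numbers quoted above. A minimal sketch of the arithmetic, where the function name `share_of_window` is only illustrative:

```python
# Publishing window per segment, from the 2-second segment duration quoted above.
SEGMENT_WINDOW_MS = 2000


def share_of_window(latency_ms: float, window_ms: float = SEGMENT_WINDOW_MS) -> float:
    """Return the percentage of the publishing window spent waiting on storage."""
    return 100.0 * latency_ms / window_ms


median_share = share_of_window(113)  # median S3 write latency from the text
p99_share = share_of_window(267)     # p99 S3 write latency from the text

print(f"median: {median_share:.2f}% of the window")
print(f"p99:    {p99_share:.2f}% of the window")
```

At the median the storage wait consumes a little under 6% of the window; at p99 it is more than double that, before encoding, packaging, and DRM have taken their share.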
The original Live Origin architecture relied on AWS S3 as the backing store for video segments. When the packager finished encoding a segment, it issued a PUT request to S3; when an Open Connect node needed that segment, it issued a GET. The problem was that S3 is designed for general-purpose durability, not for consistently low latency. High latency variation on writes meant segments frequently missed their publishing window. At high request rates exceeding 100 RPS per event, S3 throttled the origin, causing playback latency spikes visible to viewers as buffering. The team knew that scaling to tens of millions of concurrent streams would make this completely untenable: a generic storage solution cannot serve as the foundation of a real-time broadcast.
Even after replacing S3, the engineers faced a second failure mode that they called the origin storm. When a new segment becomes available, at the top of every 2-second clock tick, potentially dozens of top-tier Open Connect nodes across different geographic sites all issue GET requests simultaneously. Each segment can be several megabytes of encoded video. Back-of-envelope calculations put worst-case read throughput at over 100 Gbps, a volume that would obliterate write performance on any strongly-consistent database, including Apache Cassandra. The engineers had traded one problem for another: a write-optimized store that couldn't survive its own read traffic.
Problem
S3 Can't Keep Up
Early live events reveal S3 segment writes hitting 113ms median latency and 267ms at p99, against a hard 2-second publishing deadline. CDN nodes requesting segments early get throttled responses; playback stalls and buffering appear for viewers in real time.
Cause
Generic Storage Meets Real-Time Deadlines
S3 lacks the predictable write latency Netflix requires for live. Its throttling kicks in at the exact request rates a live event generates. No amount of tuning can make a general-purpose object store behave like a real-time media system.
Solution
Custom KeyValue Store on Cassandra + EVCache
Netflix builds a custom KeyValue abstraction layered on Apache Cassandra for durability, and adds EVCache (Memcached-based) as a write-through read cache. Large segment payloads are chunked to enable idempotent retries and load distribution across the Cassandra cluster. Separate EC2 stacks and storage clusters are provisioned for publishing and CDN-facing traffic.
Result
65M Streams Without a Dropped Segment
Median write latency drops from 113ms to 25ms; p99 improves from 267ms to 129ms. The EVCache layer absorbs nearly all read traffic, allowing the system to sustain 200Gbps+ read throughput without touching the write path. The Tyson vs. Paul fight streams successfully to 65 million concurrent viewers — the largest live sports event ever delivered over the internet.
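The chunking mentioned in the Solution card is what makes retries safe. The sketch below is an assumption-laden illustration, not Netflix's KeyValue API: a plain `dict` stands in for the Cassandra-backed store, and the key layout, the `CHUNK_SIZE` value, and the manifest format are all hypothetical.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; the source does not give one


def chunk_key(segment_id: str, index: int) -> str:
    """Deterministic key, so the same chunk always lands under the same key."""
    return f"{segment_id}/chunk/{index:05d}"


def write_segment(store: Dict[str, bytes], segment_id: str, payload: bytes,
                  chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Split a payload into fixed-size chunks and write each under its own key.

    Re-running with the same inputs rewrites every key with identical bytes,
    so retrying after a partial failure cannot corrupt the segment.
    """
    keys = []
    for index, offset in enumerate(range(0, len(payload), chunk_size)):
        key = chunk_key(segment_id, index)
        store[key] = payload[offset:offset + chunk_size]
        keys.append(key)
    # The manifest is written last, so a reader never sees a half-written segment.
    manifest = f"{len(keys)}:{hashlib.sha256(payload).hexdigest()}"
    store[f"{segment_id}/manifest"] = manifest.encode()
    return keys


def read_segment(store: Dict[str, bytes], segment_id: str) -> bytes:
    """Reassemble a segment from its chunks and verify its checksum."""
    count, digest = store[f"{segment_id}/manifest"].decode().split(":")
    payload = b"".join(store[chunk_key(segment_id, i)] for i in range(int(count)))
    if hashlib.sha256(payload).hexdigest() != digest:
        raise ValueError("segment failed integrity check")
    return payload


store: Dict[str, bytes] = {}
payload = bytes(range(256)) * 1000           # 256,000 bytes of sample data
keys = write_segment(store, "seg-42", payload)
snapshot = dict(store)
write_segment(store, "seg-42", payload)      # simulated retry of the same write
```

Because the chunk keys are derived only from the segment ID and chunk index, the retry leaves the store byte-for-byte identical to `snapshot`. In a real cluster, distinct keys would also spread one large segment across several partitions rather than concentrating it on one.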
The elegant solution to the origin storm was a write-through cache: every segment written to Cassandra is simultaneously cached in EVCache (Netflix's distributed Memcached layer). When an Open Connect node requests a segment, it hits EVCache first. Cache hits serve at network speed; only misses reach Cassandra. This achieves read-write separation without separate infrastructure: the write path remains clean and fast through Cassandra's LSM engine, while reads are absorbed almost entirely by the in-memory cache. The team validated this against its own back-of-envelope estimate: if the cache hits 90% of reads, then only 10% of the theoretical 100 Gbps storm, about 10 Gbps, ever reaches Cassandra, putting it comfortably in the sustainable range. In practice the system exceeded 200 Gbps sustained read throughput with no write degradation observed.
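The write-through pattern and the 90% estimate can be sketched in a few lines. This is an illustrative model under stated assumptions, not the Live Origin implementation: two plain dictionaries stand in for Cassandra and EVCache, the class and function names are hypothetical, and the two writes happen sequentially rather than concurrently.

```python
from typing import Dict, Optional


class WriteThroughOrigin:
    """Toy origin: every write goes to the durable store and the cache."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}  # stand-in for Cassandra
        self.cache: Dict[str, bytes] = {}  # stand-in for EVCache
        self.store_reads = 0               # reads that fell through to the store

    def put(self, key: str, segment: bytes) -> None:
        # Write-through: the cache is populated at write time, so the first
        # reader after publication already gets a hit.
        self.store[key] = segment
        self.cache[key] = segment

    def get(self, key: str) -> Optional[bytes]:
        segment = self.cache.get(key)
        if segment is not None:
            return segment
        # Cache miss: fall back to the store and repopulate the cache.
        self.store_reads += 1
        segment = self.store.get(key)
        if segment is not None:
            self.cache[key] = segment
        return segment


def store_read_gbps(total_read_gbps: float, cache_hit_ratio: float) -> float:
    """Read throughput that still reaches the durable store after caching."""
    return total_read_gbps * (1.0 - cache_hit_ratio)


origin = WriteThroughOrigin()
origin.put("event/seg-0001", b"encoded video bytes")
for _ in range(50):                      # simulated burst of CDN requests
    origin.get("event/seg-0001")
print("store reads during burst:", origin.store_reads)
print("store load at 90% hit ratio (Gbps):", round(store_read_gbps(100, 0.9), 3))
```

In this toy model the burst never touches the store, because the segment was cached at write time. The helper reproduces the estimate in the text: a 90% hit ratio leaves roughly 10 of the theoretical 100 Gbps for the store to serve.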
THE REDUNDANT PIPELINE
Netflix runs two completely independent live encoding pipelines across separate AWS regions, with separate contribution feeds, encoders, and packagers.
The two pipelines are kept aligned so that they produce interchangeable segments. When the Live Origin receives a CDN request, it selects the first valid segment from either pipeline, providing transparent, automatic failover without any client involvement.
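The "first valid segment from either pipeline" rule can be sketched as a short selection function. This is a minimal illustration under assumptions: the `Segment` fields and the validity check (a completeness flag plus a non-empty payload) are hypothetical, since the source does not say how Live Origin decides a segment is valid.

```python
from typing import Iterable, NamedTuple, Optional


class Segment(NamedTuple):
    pipeline: str    # which encoding pipeline produced the segment
    sequence: int    # segment number, identical across both pipelines
    payload: bytes
    complete: bool   # hypothetical flag set by the packager on a clean write


def is_valid(segment: Optional[Segment]) -> bool:
    """Assumed validity rule: present, fully written, and non-empty."""
    return segment is not None and segment.complete and len(segment.payload) > 0


def select_segment(candidates: Iterable[Optional[Segment]]) -> Optional[Segment]:
    """Return the first valid candidate, or None if every pipeline failed."""
    for candidate in candidates:
        if is_valid(candidate):
            return candidate
    return None


# Pipeline A produced a truncated segment; pipeline B's copy is intact.
from_a = Segment(pipeline="A", sequence=7, payload=b"", complete=False)
from_b = Segment(pipeline="B", sequence=7, payload=b"video", complete=True)
chosen = select_segment([from_a, from_b])
print(chosen.pipeline if chosen else "no valid segment")
```

Because the choice is made at the origin, the CDN and the player receive an ordinary segment response and never learn which pipeline it came from.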
Netflix spent three years quietly solving a problem nobody knew it had, and the 65 million people watching the fight had no idea any of this was happening, which is exactly how it was supposed to work.
TechLogStack: built at scale, broken in public, rebuilt by engineers