GitHub's architecture evolved over 18 years around a core assumption: the unit of load is a human developer. A human opens a PR, waits for review, pushes a few commits, and merges. The platform's service graph (Git storage, the mergeability engine, branch protection evaluation, the Actions job dispatch queue, the search indexer, notification fan-out, webhook delivery, permission evaluation, and the API gateway) was sized and coupled around this human-paced access pattern. Each service in the chain both depended on other services and was depended on by them. For years this tight coupling was efficient and kept GitHub easy to reason about.
AI agents broke the architecture's fundamental assumption. An agent doesn't open a PR and wait. An agent opens 50 PRs in parallel, each triggering the full service chain simultaneously. At scale, this creates a concurrency storm that amplifies through every layer of the graph. GitHub CTO Vlad Fedorov described it precisely: a single PR touches Git storage, mergeability checks, branch protection, Actions, search, notifications, permissions, webhooks, APIs, background jobs, caches, and databases. When the number of concurrent PRs scales 4x in six months, the pressure on every one of those systems scales accordingly — and the interconnected failures begin.
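The fan-out arithmetic can be sketched in a few lines of Go. This is a back-of-the-envelope model, not a description of GitHub's internals: the `subsystems` list is taken from the twelve touchpoints named above, and the `FanOut` function and its one-operation-per-subsystem-per-PR rule are simplifying assumptions made for illustration.

```go
package main

import "fmt"

// subsystems lists the per-PR touchpoints named in the text above.
var subsystems = []string{
	"git storage", "mergeability", "branch protection", "actions",
	"search", "notifications", "permissions", "webhooks",
	"apis", "background jobs", "caches", "databases",
}

// FanOut returns how many subsystem operations start when concurrentPRs
// pull requests are opened at once. Assumption (hypothetical): exactly one
// operation per subsystem per PR; real PRs trigger many more.
func FanOut(concurrentPRs int) int {
	return concurrentPRs * len(subsystems)
}

func main() {
	fmt.Println(FanOut(1))   // one human-paced PR
	fmt.Println(FanOut(50))  // one agent opening 50 PRs in parallel
	fmt.Println(FanOut(200)) // the same agent load after 4x growth
}
```

Even under this conservative model, one agent opening 50 PRs starts 600 subsystem operations at once where a human would start 12, and a 4x increase in concurrent PRs multiplies the load on every subsystem by the same factor.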
A Single GitHub PR: The 10+ Subsystems It Touches
flowchart TD
pr["AI Agent Opens PR"]
pr --> git["Git Storage\n(commit objects, pack files)"]
pr --> merge["Mergeability Engine\n(conflict detection)"]
pr --> bp["Branch Protection\n(rule evaluation)"]
pr --> actions["GitHub Actions\n(CI job dispatch)"]
pr --> search["Search Indexer\n(Elasticsearch update)"]
pr --> notif["Notification Fan-out\n(email, web, mobile)"]
pr --> perm["Permission Layer\n(repo access check)"]
pr --> webhook["Webhook Delivery\n(external integrations)"]
pr --> api["API Gateway\n(rate limit tracking)"]
pr --> jobs["Background Jobs\n(async processing queue)"]
pr --> cache["Cache Invalidation\n(repo metadata)"]
pr --> db["Database Writes\n(PR state, comments)"]
style pr fill:#24292f,color:#ffffff
style actions fill:#ef4444,color:#ffffff
style search fill:#ef4444,color:#ffffff
GitHub Actions: Weekly Compute Minutes — The AI Agent Surge
xychart-beta
title "GitHub Actions Weekly Compute Minutes"
x-axis ["2023", "2024", "H1 2025", "Dec 2025", "Early 2026"]
y-axis "Minutes (Millions)" 0 --> 2200
bar [500, 750, 1000, 1500, 2100]
THE RUBY GIL: WHY THE MONOLITH COULDN'T SCALE
Ruby's Global Interpreter Lock (GIL), which CRuby itself calls the Global VM Lock (GVL), is a mutex that prevents multiple threads from executing Ruby code simultaneously in the same process. For human-paced web traffic, where a request comes in, does some database work, and returns a response, the GIL is rarely the bottleneck, because the lock is released while a thread waits on I/O. For AI agent traffic, where thousands of operations arrive concurrently and each one fans out across a dozen or more internal services, the GIL becomes a hard ceiling on how much Ruby code one process can run.
Even on a 64-core server, the threads of a single CRuby process execute Ruby code on at most one core at a time. Using the remaining cores means running more processes, each with its own memory footprint. For this workload, the fix isn't optimization. It's a different runtime. Go's scheduler multiplexes goroutines across all available CPU cores with no global interpreter lock, which makes it architecturally suited to the concurrency profile that AI agent workflows generate. GitHub's Ruby-to-Go migration for performance-critical services is the right move: not a language preference, but a response to a hard constraint of the runtime.
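A minimal Go sketch shows what running across cores looks like in practice. Everything here is illustrative: `checkPR` is a hypothetical stand-in for CPU-bound per-PR work, and `ProcessAll` is an assumed helper name, not anything from GitHub's codebase. The only library calls used are from the standard `runtime` and `sync` packages.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// checkPR is a hypothetical stand-in for CPU-bound per-PR work.
func checkPR(id int) int {
	sum := 0
	for i := 0; i < 1000000; i++ {
		sum += (id + i) % 7
	}
	return sum
}

// ProcessAll runs checkPR for ids 0..n-1 on a pool of `workers` goroutines
// and returns the results indexed by id.
func ProcessAll(n, workers int) []int {
	results := make([]int, n)
	ids := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ids {
				// Each id is delivered to exactly one goroutine, so
				// every write lands on a distinct slice element.
				results[id] = checkPR(id)
			}
		}()
	}
	for id := 0; id < n; id++ {
		ids <- id
	}
	close(ids)
	wg.Wait()
	return results
}

func main() {
	// GOMAXPROCS(0) reports the current limit without changing it.
	workers := runtime.GOMAXPROCS(0)
	results := ProcessAll(50, workers)
	fmt.Printf("processed %d PR checks on up to %d cores\n", len(results), workers)
}
```

The goroutines in the pool can execute `checkPR` on separate cores at the same moment, inside one process and one address space. An equivalent thread pool in a single CRuby process would run the same loop one thread at a time.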