Engineering Articles

Real posts from engineering blogs — broken down for system design interviews, with a quiz to test what you learned.

Cloudflare · September 26, 2025

How Cloudflare Cut Cold Starts 10x: From TLS Pre-Warming to Consistent Hash Sharding

Cloudflare Workers started with 5 ms cold starts that were hidden behind TLS handshakes. As Workers grew to support full applications (10 MB scripts, 400 ms startup budgets), cold starts began to take longer than the TLS handshake that had masked them, and the original trick stopped working. This post covers both generations of their solution: the TLS SNI pre-warming trick and the consistent hash ring sharding system that ultimately cut eviction rates 10x and pushed warm request rates to 99.99%.

14 min · Advanced · Serverless · System Design · Distributed Systems

OpenAI · January 22, 2026

Scaling PostgreSQL to power 800 million ChatGPT users

OpenAI runs ChatGPT for 800 million users on a single-primary PostgreSQL instance with ~50 read replicas and no sharding. Over the past year, database load grew 10x. This post covers the optimizations they made to keep it running: connection pooling, cache stampede prevention, workload isolation, rate limiting, and safe schema management.

12 min · Advanced · Databases · System Design · PostgreSQL

OpenAI · January 29, 2026

Inside OpenAI's In-House Data Agent: From Question to Insight in Minutes

OpenAI built a bespoke internal AI data agent that lets any employee, not just data engineers, go from a natural-language question to a verified insight in minutes. The agent is powered by GPT-5.2, uses Codex to deeply understand table semantics from source code, retrieves context via RAG over 70k datasets (600 PB), and continuously self-improves through a layered memory system. The post breaks down its six-layer context architecture, conversational reasoning loop, eval-driven quality assurance, and key lessons in agent design.

14 min · Advanced · AI Agents · System Design · Data Engineering

Vimeo · January 16, 2026

Building AI-Powered Subtitles at Vimeo

Vimeo's AI subtitle system uses LLMs to translate video subtitles across 9+ languages. The core challenge: LLMs optimize for fluency and merge fragmented speech into clean sentences, breaking subtitle timing sync. Their fix is a three-phase "split-brain" pipeline that separates creative translation from structural line mapping, with a self-healing fallback chain that guarantees 100% of subtitle slots are filled.

10 min · Intermediate · AI/ML · System Design · LLMs

Cursor · March 2025

How Cursor Built Fast Regex Search with N-Gram Indexing

Cursor's AI-powered code editor needs sub-second regex search across entire codebases to keep agents productive. Ripgrep alone takes 15+ seconds on large monorepos because it scans every file. This post explains how Cursor built a client-side n-gram inverted index that narrows the candidate files before ripgrep runs. It covers trigrams, Bloom filter masks, sparse n-grams with frequency-based weight functions, memory-mapped file formats, and why the index lives on the client rather than a server.

10 min · Advanced · Search · Indexing · System Design

OpenAI · January–February 2026

How OpenAI built Codex: inside the agent loop and harness

OpenAI's Codex powers a cross-platform coding agent (CLI, web, VS Code, macOS app) from a single shared harness. Two engineering posts reveal exactly how that harness works: the agent loop that orchestrates model inference and tool calls, the prompt structure and caching strategy that keeps it efficient, and the App Server JSON-RPC protocol that lets every client surface share the same core.

14 min · Advanced · AI Agents · System Design · LLMs