I use AI coding agents every day. I believe they are reshaping how we build software, and I think the teams that adopt them deliberately will outperform those that don't. I am not writing this to warn you away from AI-assisted development. I am writing this because the loudest voices in the AI enthusiasm camp are also the most allergic to discussing what can go wrong. And that worries me more than
Kubernetes and AI have become unlikely bedfellows—and the numbers prove it. New data from CNCF and SlashData reveals that two-thirds of organizations running generative AI models have standardized on Kubernetes for orchestration. But here's the thing: it's not because Kubernetes magically solves AI problems. It's because the engineering fundamentals that make Kubernetes valuable—standardization, r
On Second Thought — Episode 06 The ORM hides the SQL. The cache hides the ORM. The service mesh hides the services. The operator hides the YAML, which already hid the kubelet, which already hid the container, which already hid the process. By Tuesday, nobody quite remembers what the original problem was. They are too busy configuring its sixth wrapper. This is the post about that wrapper. When som
Every team experiences incidents. The teams that grow stronger from them are the ones that take postmortems seriously — not as blame sessions, but as structured learning opportunities. Yet most postmortems end up as a wall of text nobody reads twice, filed away and forgotten until the same incident happens again six months later. This guide walks you through writing postmortems that genuinely chan
How Cloudflare Built Resilience: Lessons from Their Infrastructure Overhaul When a single misconfiguration can cascade across a global CDN and take down customer traffic, every deployment becomes a high-stakes decision. Cloudflare recently completed a massive push to make their infrastructure fundamentally more resilient—and their approach offers critical lessons for anyone operating at scale. M