More rules should mean better output. That's the intuition. I spent weeks building a comprehensive CLAUDE.md — 200 lines covering naming conventions, security rules, error handling, architectural patterns, import ordering, type safety requirements, and more. I was proud of it. I'd thought through every scenario. Then I scored the output. 79.0 / 100. My carefully crafted documentation was actively
Claude + Mobile via MCP: Giving the Model Hands on a Real Phone
I plugged in a Pixel two months ago, ran one command in Claude Desktop, and watched it open Maps and start navigation to my home address from a single-sentence prompt. It was the first time I'd ever seen a language model physically operate a phone. Latency was about two seconds per action; the part that surprised me was the third st
AI-Native Mobile Testing: What It Actually Means in 2026
The phrase "AI-native" has been thrown around in the testing space since 2019. Almost every tool calling itself that just bolts a language model on top of Appium and ships the same brittle XPath selectors with a new label. That's not AI-native testing. That's Appium with a chatbot. This post is about what AI-native actually has to mean to
The Missing Control Plane for Local AI Agents
I sat with my Pixel for 20 minutes trying to get Claude Desktop to dictate a Slack message via accessibility. It was miserable. The model was capable. The transport wasn't. That gap, between an AI that can reason and an AI that can actually do, is what I've been working on with Drengr. This post is the version of the argument I'd give to anyone bui
Have you ever looked at code you wrote six months ago and thought, "Who wrote this monster?" Relax, it happens to all of us. In software engineering, writing code that a machine understands is the easy part. The real challenge is writing code that other humans (including your future self) can understand, maintain, and scale. This is exactly where Software Design Principles come into play. In this
Part 1 of 5 in The New Engineering Contract: what it means to lead engineers when AI is doing more of the coding.
SWE-CI tested 18 AI models across 71 consecutive commits. Most broke something on commit 47 they'd already broken on commit 1. That's not an intelligence problem. That's a learning system that isn't learning. A paper made me uncomfortable this month. Not because of what it found about