The Wall Street Journal ran a piece yesterday on JustPaid, a 9-person Mountain View startup. They used OpenClaw and Claude Code to stand up seven AI agents that write code, review it, and run QA around the clock. In one month: 10 major features shipped, each of which would have taken a human engineer a month or more. This story is getting passed around as proof that the autonomous engineering team is here.
MCP vs Skills: a practical decision guide for builders

"I need my agent to do X. Skill or MCP?" If you build agents on Claude or anything MCP-compatible, this is the question that actually matters. The two patterns get pitched as alternatives. They are not. They solve different problems, and most production agents need both. Here is the decision rule, the framing for each, and the anti-patterns I keep seeing.
We Rewrote Our Angular 18 App in React 20 and Increased Developer Velocity by 40%

Last quarter, our engineering team made the bold call to rewrite our 3-year-old Angular 18 production application in React 20. After 6 months of development, we cut over to the new stack with zero downtime, and the results have exceeded our expectations: we’ve measured a 40% increase in developer velocity, alongsid
White labeling is more common than you might think. When developing software, you often need to deploy the same application for multiple clients, each requiring their own customization: unique color palettes, logos, or client-specific link variants. Without a proper strategy, you might be tempted to simply clone the existing repository and apply client-specific changes on demand. However, this
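The single-codebase alternative to repo cloning can be sketched as a per-client theme registry resolved at deploy time. This is a minimal illustration, not any particular framework's API; the client names, color values, and the CLIENT environment variable are all hypothetical:

```python
# Hypothetical per-client theme registry for a white-labeled app.
# One codebase, many brands: the deployment sets CLIENT instead of forking the repo.
import os

THEMES = {
    "acme": {"primary": "#0052cc", "logo": "acme.svg"},
    "globex": {"primary": "#cc0000", "logo": "globex.svg", "support_url": "https://globex.example/help"},
}

# Defaults cover any key a client theme leaves unset.
DEFAULT = {"primary": "#333333", "logo": "default.svg", "support_url": "https://example.com/help"}

def theme_for(client: str) -> dict:
    """Resolve a client's theme, merging its overrides over the defaults."""
    return {**DEFAULT, **THEMES.get(client, {})}

# The running instance picks up its brand from the environment.
theme = theme_for(os.environ.get("CLIENT", ""))
```

The point of the merge-over-defaults pattern is that adding a new client is a data change (one registry entry), not a code change, which is exactly what the clone-per-client approach loses.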
In March 2026, a rogue AI agent at Meta triggered a Sev 1 security incident. Sensitive company and user data was exposed to unauthorized employees for nearly two hours. The agent held valid credentials. It operated inside authorized boundaries. It passed every identity check. And yet. Identity and Access Management answers one question: Is this agent who it says it is? It doesn't answer: Was this
The Problem Nobody Talks About

AI can write code, generate content, analyze data, design systems, and manage projects. It's getting better every month. The natural question: what's left for humans? The wrong answer: "AI will replace us." The right answer is uncomfortable: stop picking the best AI. Run multiple AIs in competition, and become the judge. Three rules, learned the hard way: Multiple
Anthropic now ships at least three different memory models inside the Claude product family, and they don't behave the same way. Claude.ai has a chat memory feature for Pro, Max, Team, and Enterprise users that summarizes prior conversations and injects that summary into new chats. Claude Code has CLAUDE.md files plus a separate "auto memory" directory the model writes to itself, both loaded at session start.
Iris v0.4.0 ships today. It's the release where protocol-native eval crosses from "deterministic rules" into "semantic scoring" — without giving up any of what made the deterministic layer work. Three headline features plus a lot of infrastructure work that quietly compounds. I'll go through each, why it matters, and how it fits the thesis. Heuristic rules catch a lot: length, keyword overlap, PII
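The deterministic layer described above (length, keyword overlap, PII) can be pictured as a handful of small rule functions composed into one score. To be clear, this is my own sketch of the general technique, not Iris's actual API; every function name, threshold, and regex here is an assumption:

```python
# Sketch of heuristic eval rules: deterministic checks on a model response.
# Names, thresholds, and patterns are illustrative, not Iris's real interface.
import re

def length_ok(text: str, lo: int = 20, hi: int = 2000) -> bool:
    """Reject responses that are suspiciously short or runaway long."""
    return lo <= len(text) <= hi

def keyword_overlap(text: str, expected: set[str]) -> float:
    """Fraction of expected keywords that appear in the response."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return len(words & expected) / len(expected) if expected else 1.0

PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US-SSN-shaped number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
]

def contains_pii(text: str) -> bool:
    """Flag responses that leak obviously PII-shaped strings."""
    return any(p.search(text) for p in PII_PATTERNS)

def heuristic_score(text: str, expected: set[str]) -> dict:
    """Run every deterministic rule and report the results together."""
    return {
        "length_ok": length_ok(text),
        "overlap": keyword_overlap(text, expected),
        "pii": contains_pii(text),
    }
```

Rules like these are cheap, reproducible, and explainable, which is why a semantic-scoring layer is best framed as sitting on top of them rather than replacing them.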