In March 2026, a rogue AI agent at Meta triggered a Sev 1 security incident. Sensitive company and user data was exposed to unauthorized employees for nearly two hours. The agent held valid credentials. It operated inside authorized boundaries. It passed every identity check. And yet.

Identity and Access Management answers one question: Is this agent who it says it is? It doesn't answer: Was this
The Problem Nobody Talks About

AI can write code, generate content, analyze data, design systems, and manage projects. It's getting better every month. The natural question: what's left for humans?

The wrong answer: "AI will replace us." The right answer is uncomfortable: stop picking the best AI. Run multiple AIs in competition, and become the judge.

Three rules, learned the hard way: Multiple
Anthropic now ships at least three different memory models inside the Claude product family, and they don't behave the same way. Claude.ai has a chat memory feature for Pro, Max, Team, and Enterprise users that summarizes prior conversations and injects that summary into new chats. Claude Code has CLAUDE.md files plus a separate "auto memory" directory the model writes to itself, both loaded at se
Iris v0.4.0 ships today. It's the release where protocol-native eval crosses from "deterministic rules" into "semantic scoring" — without giving up any of what made the deterministic layer work. Three headline features plus a lot of infrastructure work that quietly compounds. I'll go through each, why it matters, and how it fits the thesis.

Heuristic rules catch a lot: length, keyword overlap, PII
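To make the deterministic layer concrete, here is a minimal sketch of what such heuristic checks can look like. Every function name, threshold, and rule here is an illustrative assumption, not Iris's actual API; the PII check is deliberately narrow (email-like strings only) to keep the example short.

```python
import re

# Illustrative PII pattern: email-like strings only (an assumption, not Iris's rule set).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def check_length(text: str, max_chars: int = 2000) -> bool:
    """Pass if the response stays within a length budget."""
    return len(text) <= max_chars

def check_keyword_overlap(text: str, expected: set[str], threshold: float = 0.5) -> bool:
    """Pass if enough of the expected keywords appear in the response."""
    words = set(re.findall(r"\w+", text.lower()))
    hits = len(expected & words)
    return hits / max(len(expected), 1) >= threshold

def check_no_pii(text: str) -> bool:
    """Pass if no email-like strings leak into the response."""
    return EMAIL_RE.search(text) is None

def run_heuristics(text: str, expected: set[str]) -> dict[str, bool]:
    """Run all deterministic rules and return a pass/fail verdict per rule."""
    return {
        "length": check_length(text),
        "keyword_overlap": check_keyword_overlap(text, expected),
        "no_pii": check_no_pii(text),
    }
```

Checks like these are cheap, fully deterministic, and explainable per rule, which is exactly what a semantic-scoring layer cannot give up underneath it.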