Testing Firefox Extensions with Playwright: End-to-End Testing Guide
Extension testing is one of those things everyone knows they should do but few actually do. I've been using Playwright for end-to-end tests on the Weather & Clock Dashboard extension and it's changed how I think about extension quality. Unit tests don't cover the biggest failure modes: Does the extension actually load in Firefo
More rules should mean better output. That's the intuition. I spent weeks building a comprehensive CLAUDE.md — 200 lines covering naming conventions, security rules, error handling, architectural patterns, import ordering, type safety requirements, and more. I was proud of it. I'd thought through every scenario. Then I scored the output. 79.0 / 100. My carefully crafted documentation was actively
Why I built another Ruby test runner inspired by Playwright Test
Ruby already has great testing tools. If you are building Rails applications today, you probably use one of these combinations: RSpec + Capybara; Minitest + Capybara; Rails system tests; maybe Selenium, Cuprite, Ferrum, or Playwright through Ruby bindings. These tools are mature, battle-tested, and widely used. So the natural question
I wanted to test my web app. That's it. A Next.js portfolio and a SaaS chat — run some accessibility checks, catch console errors, verify nothing's broken on mobile. The kind of thing you do before pushing to production. I opened Claude Code, connected Playwright MCP, typed "test the app" and watched it burn through tokens like there was no tomorrow. Then /compact fired at 18% text context. Then I
Have you ever looked at code you wrote six months ago and thought, "Who wrote this monster?" Relax, it happens to all of us. In software engineering, writing code that a machine understands is the easy part. The real challenge is writing code that other humans (including your future self) can understand, maintain, and scale. This is exactly where Software Design Principles come into play. In this
Part 1 of 5 in The New Engineering Contract — what it means to lead engineers when AI is doing more of the coding. SWE-CI tested 18 AI models across 71 consecutive commits. Most broke something on commit 47 they'd already broken on commit 1. That's not an intelligence problem. That's a learning system that isn't learning. A paper made me uncomfortable this month. Not because of what it found about