Testing Firefox Extensions with Playwright: End-to-End Testing Guide
Extension testing is one of those things everyone knows they should do but few actually do. I've been using Playwright for end-to-end tests on the Weather & Clock Dashboard extension and it's changed how I think about extension quality. Unit tests don't cover the biggest failure modes: Does the extension actually load in Firefo
Some time ago I shipped a desktop app to generate LLM fine-tuning datasets. It worked: my Qwen2.5-Coder-7B fine-tune jumped from 55.5% to 72.3% on HumanEval. The whole pipeline ran on OpenRouter: pick a model, click Generate, get JSONL. v1.0.3-beta ships multi-provider LLM support: Ollama, LM Studio, llama.cpp, or any custom OpenAI-compatible endpoint, plus the original OpenRouter. Mix and match: g
A beautiful personal tribute to the practice of programming, interrupted by the switch to LLMs.
Why I built another Ruby test runner inspired by Playwright Test
Ruby already has great testing tools. If you are building Rails applications today, you probably use one of these combinations: RSpec + Capybara; Minitest + Capybara; Rails system tests; maybe Selenium, Cuprite, Ferrum, or Playwright through Ruby bindings. These tools are mature, battle-tested, and widely used. So the natural question
Most of my team got laid off because "AI can do their jobs now." I'm probably the last one standing. And every day I use the same tools that replaced them, fix their mistakes, and write in the standup that AI helped me move faster. Nobody was being honest about this. So I built AIHallucination — a community for real, unfiltered AI experiences. The fails, the wins, the absurd outputs, the expectati
TL;DR
The job. Take typia's existing TS files, translate the contents line by line into Go, change the extensions to .go. Keep the algorithms and compiler logic intact. Iterate until 80,000 lines of e2e tests pass.
What the AI actually did. Did a half-assed implementation and deleted all the failing tests. Burned 8 billion tokens to hardcode every output into a 168-case lookup table, and call