I run both in production. Here's the real comparison: not theoretical, but from actual use building developer tools.

| | Local LLM (Ollama) | Gemini API (Free) |
|---|---|---|
| Cost | $0 forever | $0 (free tier) |
| Privacy | 100% local | Data sent to Google |
| Setup | Install Ollama + pull model | Get API key (2 min) |
| Quality | Good (7B), Great (70B) | Excellent |
| Speed | Fast if model lo
In the fast-paced world of continuous integration and deployment (CI/CD), managing sensitive information like API keys, tokens, and credentials—collectively known as secrets—is not just a best practice; it's a critical foundation for security and efficiency. GitHub Actions provides a robust framework for automating workflows, but a common friction point for many development teams, particularly tho
The Challenge of Scalable Secrets Management in GitHub Actions

For development teams scaling beyond a handful of repositories, managing environment-specific variables and secrets in GitHub Actions can quickly become a significant bottleneck. The manual duplication of configurations across multiple repos, especially when dealing with distinct environments like development, staging, and production
A College Project That Planted a Seed

Years ago I was on a university team trying to build a Go AI. We explored Monte Carlo simulation for lookahead search, basic neural networks for pattern recognition, and expert systems for encoding domain knowledge. None of them worked well enough on their own. Go's branching factor is enormous, so brute-force search fails quickly. Neural networks without th
DeepClaude: I Combined Claude Code with DeepSeek V4 Pro in My Agent Loop and the Numbers Threw Me Off

DeepSeek V4 Pro correctly solves 94% of deep reasoning tasks in my loop… but the latency cost makes it unusable for 60% of my agent cases. Yeah, you read that right. And that completely blows up the narrative of "combining models is always better." Tuesday night I watched the DeepClaude post cli
Series: AI Isn’t an Engineering Problem Anymore (Part 2) In the last post, I talked about hitting a usage limit while debugging my robot and realizing how repetitive my own AI usage had become. When we use LLMs, whether through APIs or tools, it feels like every request is new. The inefficiency isn’t from using AI too much. You don’t ask once, you iterate. These are the most interesting ones. Some
Run the same brand query through ChatGPT, Gemini, Perplexity, Claude, and Grok. Read the citations. The cited URLs will not be the same, the brands featured will not be the same, and in roughly a third of cases one tool will cite your brand confidently while another does not mention it at all. The temptation is to reach for an algorithmic explanation: different rerankers, different summarisation st
Hermes Agent from Nous Research is a model-agnostic, tool-using assistant you run locally or on a VPS. Hermes does not lock you into one surface. You can use the classic hermes / hermes chat CLI, the full-screen hermes --tui session, a long-running hermes gateway for Telegram, Discord, Slack, and other messaging platforms, or hermes dashboard for a local browser UI when the web extra is installed.