My Hugo blog was downloading 3.6 MB of JavaScript and 40 KB of external CSS on every page load. For a static blog with mostly text and a few diagrams, that was absurd. Here is how I fixed it. HTML: 86 KB JavaScript: 3.6 MB (Mermaid + KaTeX) CSS: 40 KB (KaTeX stylesheets) Problem: render-blocking scripts loaded on every page for math and diagrams Adding minifyOutput = true to hugo.toml shrunk HTML
Some time ago, I was building a chat application using AWS Websocket API gateway. Things were going smoothly. I created a WebSocket API Gateway, added $connect, $disconnect, and sendMessage/addGroup routes. From the frontend (React) side, everything was fire-and-forget. You send a message, and the onMessageHandler takes care of it 💪🏼 But then a new requirement of uploading files using S3 signed
When you have 5 unrelated questions, should you pack them into one message to the LLM, or send 5 requests simultaneously? Which is faster? Splitting into multiple independent parallel requests is almost always faster. This isn't a gut feeling — it's determined by the underlying inference mechanism of LLMs. Let's walk through the reasoning from first principles. To understand this problem, you firs