## The setup

I was building an AI article generator. Four phases:

1. Research — pull live SERP data, extract entities, score competitors
2. Brief — Claude Sonnet writes a structured brief from the SERP
3. Draft — section-by-section drafting in a fixed persona voice
4. Polish — adversarial review pass that flags AI tells

End-to-end: 6–8 minutes. Each phase: 60–180 seconds. Multiple LLM calls per phase. Token st
When you have 5 unrelated questions, should you pack them into one message to the LLM, or send 5 requests simultaneously? Which is faster?

Splitting into multiple independent parallel requests is almost always faster. This isn't a gut feeling; it's determined by the underlying inference mechanism of LLMs. Let's walk through the reasoning from first principles.

To understand this problem, you firs
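The intuition can be sketched with a toy latency model. LLM decoding is autoregressive, so generation time grows roughly linearly with the number of output tokens a single request must produce. The sketch below is not a real API client: `simulated_llm_call` is a hypothetical stand-in that just sleeps in proportion to the tokens it would generate, and the 1 ms-per-token figure is an assumed constant, not a measured one. One batched request has to decode all five answers back to back, while five independent requests decode concurrently.

```python
# Toy latency model, assuming decode time scales linearly with output tokens.
import asyncio
import time

SECONDS_PER_TOKEN = 0.001  # assumed per-token decode latency (1 ms), not measured


async def simulated_llm_call(output_tokens: int) -> None:
    # Stand-in for a real LLM request: decoding is sequential per request,
    # so latency is proportional to the tokens this one request generates.
    await asyncio.sleep(output_tokens * SECONDS_PER_TOKEN)


async def batched(n_answers: int, tokens_each: int) -> float:
    # One packed request must decode every answer's tokens in sequence.
    start = time.perf_counter()
    await simulated_llm_call(n_answers * tokens_each)
    return time.perf_counter() - start


async def parallel(n_answers: int, tokens_each: int) -> float:
    # N independent requests decode concurrently on the server.
    start = time.perf_counter()
    await asyncio.gather(
        *(simulated_llm_call(tokens_each) for _ in range(n_answers))
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    t_batch = asyncio.run(batched(5, 200))    # ~5 x 200 ms of sequential decode
    t_par = asyncio.run(parallel(5, 200))     # ~200 ms, answers overlap
    print(f"batched:  {t_batch:.2f}s")
    print(f"parallel: {t_par:.2f}s")
```

Under this model, batching five 200-token answers takes roughly five times as long as firing them in parallel, which is the gap the rest of the argument explains from the mechanics of inference.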