Every senior editor knows the feeling: you ask someone to write a 2,000-word article and get back something that opens brilliantly, coasts through the middle, and ends on a sentence that sounds like the writer ran out of energy and caffeine at the same moment. That is not a people problem. It is an architecture problem.
A single “write me a great article about X” prompt hands the model too many responsibilities at once: understand the brief, choose a structure, establish a voice, write a compelling lede, maintain quality across 2,000 words, end well. Each of these is a separate cognitive task. Bundling them into one prompt means each one gets only a fraction of the model's attention, and the fraction left for sections three through five is smaller than the fraction given to sections one and two, because by then the context window is full of everything that came before.
The IO Article Library solves this with prompt decomposition. Each of the 12 prompts has one job. The brief analysis prompt reads the context brief and extracts 6 structured parameters. The voice calibration prompt reads those parameters and outputs a 200-token style specification. The structure design prompt reads the style spec and outputs a locked outline. No subsequent prompt writes freeform: every prompt executes against a tightly constrained input. The quality is consistent because the constraints are consistent.
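The hand-off between the first three prompts can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the library's actual code: the `Step` type, the template wording, `run_chain`, and the `echo_model` stub are all hypothetical names introduced here, and the stub stands in for a real model call so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a model call: takes a prompt string, returns text.
ModelFn = Callable[[str], str]


@dataclass(frozen=True)
class Step:
    name: str
    template: str  # "{input}" is replaced with the previous step's output


# The three planning steps named in the text. Template wording is illustrative.
PLANNING_STEPS: List[Step] = [
    Step("brief_analysis", "Extract the structured parameters from this brief:\n{input}"),
    Step("voice_calibration", "Write a style specification from these parameters:\n{input}"),
    Step("structure_design", "Produce a locked outline from this style spec:\n{input}"),
]


def run_chain(steps: List[Step], first_input: str, call_model: ModelFn) -> Dict[str, str]:
    """Run each step on the previous step's output only; return outputs by step name."""
    outputs: Dict[str, str] = {}
    current = first_input
    for step in steps:
        # Each prompt sees one constrained input, never the whole history.
        current = call_model(step.template.format(input=current))
        outputs[step.name] = current
    return outputs


def echo_model(prompt: str) -> str:
    """Offline stub (assumption): echoes the instruction line instead of calling an API."""
    return f"<output of: {prompt.splitlines()[0]}>"


if __name__ == "__main__":
    for name, text in run_chain(PLANNING_STEPS, "Brief: article about X", echo_model).items():
        print(name, "->", text)
```

The point of the sketch is the loop body: each step is formatted with exactly one upstream output, so a step's input cannot grow as the chain gets longer.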
Why 12 Prompts and Not One
The number 12 is not arbitrary. It is the result of decomposing a publication-ready article into its minimum set of non-overlapping, single-responsibility tasks. Remove any one prompt and you either push its work onto an adjacent prompt (degrading that prompt's output) or you skip the step entirely (producing a detectably worse article).
Three decomposition decisions matter most. First, structure before copy: the outline is locked in prompt 3 before any body copy is written in prompts 4–10. Every section prompt therefore receives the full structure as context, which prevents sections from repeating or contradicting each other, a failure mode endemic to single-prompt generation. Second, sections receive only their brief: each section-body prompt receives the locked outline and its specific section brief, not the full text of prior sections. This prevents voice drift and keeps context windows small. Third, quality pass at the end: prompt 11 reads the assembled article as a whole and flags coherence issues, which are then corrected by rerunning the relevant prompt rather than by manual editing.
Each prompt has one job. Structure before copy. Sections receive only their brief. A quality pass at the end. This is why section five reads as well as section one.
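The "sections receive only their brief" rule is easiest to see in how a section prompt might be assembled. The sketch below is an assumption about the shape of that input, not the library's actual format: `SectionBrief`, `build_section_prompt`, and the prompt layout are hypothetical, and representing the locked outline as a list of headings is a simplification made here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SectionBrief:
    index: int
    heading: str
    brief: str


def build_section_prompt(style_spec: str, outline: List[SectionBrief], index: int) -> str:
    """Assemble the input for one section-body prompt.

    Contains the style spec, the locked outline (headings only, an assumption),
    and the target section's brief. There is deliberately no parameter for the
    text of previously written sections.
    """
    target = next(s for s in outline if s.index == index)
    outline_text = "\n".join(f"{s.index}. {s.heading}" for s in outline)
    return (
        f"STYLE SPEC:\n{style_spec}\n\n"
        f"LOCKED OUTLINE:\n{outline_text}\n\n"
        f"WRITE SECTION {target.index}: {target.heading}\n"
        f"SECTION BRIEF:\n{target.brief}"
    )


if __name__ == "__main__":
    outline = [
        SectionBrief(1, "Lede", "Open with the editor's problem."),
        SectionBrief(2, "Why it fails", "Explain how one prompt dilutes attention."),
    ]
    print(build_section_prompt("Plain, direct, second person.", outline, 2))
```

Because the function signature has no slot for earlier body copy, the prompt for section five is the same size and shape as the prompt for section one.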
The 12-Prompt Chain
In the chain diagram, indigo nodes mark prompts that run on Claude Sonnet 4 (reasoning-heavy) and green nodes mark prompts that run on Claude Haiku (fast execution).
Model Routing: Sonnet vs. Haiku
The Article Library does not run all 12 prompts on the same model. It routes each prompt to the model whose capabilities match the task: Sonnet for reasoning-heavy analysis and quality review, Haiku for high-volume content execution. This is not a cost-cutting measure. It is an architectural decision that produces better output: in this pipeline, Haiku writes section bodies faster and more cleanly than Sonnet, because it stays on task and does not introduce the complexity Sonnet adds when given creative latitude.
The counterintuitive finding from routing experiments: Haiku-generated section bodies score higher on voice consistency than Sonnet-generated bodies, because Sonnet's tendency to elaborate pushes it off the locked style specification. Haiku executes the specification without editorializing. The best model for a task is not always the most capable model — it is the model whose failure modes are most compatible with the constraint structure.
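One way to express this routing is a small lookup from task type to model tier. The sketch below is hypothetical: the `Task` enum, the tier labels `"sonnet"` and `"haiku"` (placeholders, not real API model identifiers), and the assignment of prompts 1 to 3 to the analysis tier are assumptions made for illustration. The text does not describe prompt 12's job, so the sketch leaves it unrouted rather than guessing.

```python
from enum import Enum


class Task(Enum):
    ANALYSIS = "analysis"
    SECTION_BODY = "section_body"
    QUALITY_REVIEW = "quality_review"


# Placeholder tier labels (assumption), not real API model identifiers.
ROUTES = {
    Task.ANALYSIS: "sonnet",        # reasoning-heavy
    Task.SECTION_BODY: "haiku",     # high-volume execution against a tight spec
    Task.QUALITY_REVIEW: "sonnet",  # whole-article coherence check
}


def task_for_prompt(number: int) -> Task:
    """Map a prompt number to a task type, following the article's description."""
    if 1 <= number <= 3:
        # Assumption: all three planning prompts count as analysis.
        return Task.ANALYSIS
    if 4 <= number <= 10:
        return Task.SECTION_BODY
    if number == 11:
        return Task.QUALITY_REVIEW
    # Prompt 12 is not described in the text, so it is not routed here.
    raise ValueError(f"no routing rule for prompt {number}")


def model_for_prompt(number: int) -> str:
    return ROUTES[task_for_prompt(number)]


if __name__ == "__main__":
    for n in range(1, 12):
        print(n, task_for_prompt(n).value, model_for_prompt(n))
```

Keeping the routing in one table means a tier change for, say, section bodies is a one-line edit that can be tested against the same briefs.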
Before / After: Single Prompt vs. Chain
The most direct demonstration of prompt decomposition's value is a side-by-side comparison. Below are outputs for the same article brief — one generated with a single “write a full article” prompt, one generated through the 12-step chain.
Voice Consistency Matrix
Voice consistency is the most visible quality signal in long-form content. The 12-prompt chain maintains voice consistency because each section prompt receives the style specification (output of prompt 2) as its primary input — not the growing conversation. The result: measurably consistent voice from the lede to the conclusion.