Architecture

System Architecture

How the Annotated HTML Builder pipeline works end-to-end — from browser to agent sandbox to compiler proof.

System Diagram

Browser

Builder UI + Chat

POST /api/run

Next.js (Vercel) · proxies to AHB_API_URL with Bearer AHB_API_SECRET · streams SSE through

agent-service (Railway)

Independent service — clones, converts, verifies, pushes

validateAnnotations()

Gate — checks data-component + at least one annotation

cloneProjectForRun()

Clones the PROJECTS-registered repo fresh into /tmp · npm ci from volume cache

agentConvert() — Claude Agent SDK query()

Reads primitives · writes .tsx · runs tsc · repairs errors · commits · streams events

verifyIndependently()

Server-side tsc --noEmit — independent of agent self-report

pushBranch() + openPullRequest() → SSE done event

Only if committed + verified clean: push claude/ahb-* branch, open PR for review (never straight to main) · nextStep: await_review
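The step order and the push condition in the diagram can be read as one short orchestration function. The sketch below is illustrative only, not the agent-service code: the `Steps` interface, the `error` event, and the `needs_attention` value are hypothetical stand-ins. Only the step order, the "committed + verified clean" condition, and `nextStep: await_review` come from the diagram.

```typescript
// Hypothetical event and step shapes; the real agent-service types are not shown here.
type RunEvent = { type: string; nextStep?: string; detail?: string };

interface Steps {
  validate(html: string): boolean;                                     // validateAnnotations()
  clone(projectId: string): Promise<string>;                           // cloneProjectForRun(), returns a work dir
  convert(dir: string, html: string): Promise<{ committed: boolean }>; // agentConvert()
  verify(dir: string): Promise<{ clean: boolean }>;                    // verifyIndependently()
  pushAndOpenPr(dir: string): Promise<void>;                           // pushBranch() + openPullRequest()
}

async function runPipeline(
  steps: Steps,
  projectId: string,
  html: string,
  emit: (event: RunEvent) => void,
): Promise<void> {
  // Gate first: no clone and no agent spend on unannotated HTML.
  if (!steps.validate(html)) {
    emit({ type: "error", detail: "annotation gate failed" });
    return;
  }

  const dir = await steps.clone(projectId);
  const agent = await steps.convert(dir, html);
  const check = await steps.verify(dir);

  // Push only when the agent committed AND the independent tsc run is clean.
  if (agent.committed && check.clean) {
    await steps.pushAndOpenPr(dir);
    emit({ type: "done", nextStep: "await_review" });
  } else {
    // "needs_attention" is a placeholder; the source does not name this case.
    emit({ type: "done", nextStep: "needs_attention" });
  }
}
```

Injecting the steps keeps the gating logic testable without a real clone or a real agent run.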

Where Each Anthropic Surface Fits

Surface · File · Role · Layer

Claude Agent SDK · agent-service/src/lib/agent-convert.ts · Drives the query() loop in the cloned repo: reads files, writes components, runs tsc, repairs errors, commits. · Agent

Messages API (Anthropic SDK) · src/app/api/chat/route.ts · Powers the annotation assistant chat panel: streaming text response, no tool use. · Chat

Next.js App Router SSE · src/app/api/run/route.ts · Proxies to the agent service and streams its SSE events straight through to the browser. · Streaming

Validation Gate · src/lib/ahb/validate.ts · Client- and server-side check before any agent call, so no API spend on unannotated HTML. · Gate
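The streaming row is the thinnest piece: forward the request with the Bearer secret and return the upstream body untouched. A minimal sketch follows, assuming a `/run` path on the agent service and a JSON request body. Both are guesses, as are the helper names `buildUpstreamRequest` and `proxyRun`. Only `AHB_API_URL` and `AHB_API_SECRET` come from the diagram.

```typescript
// Environment shape; the two variable names come from the system diagram.
interface ProxyEnv {
  AHB_API_URL: string;
  AHB_API_SECRET: string;
}

// Build the upstream request. The "/run" path on the agent service is an assumption.
function buildUpstreamRequest(env: ProxyEnv, body: string) {
  return {
    url: new URL("/run", env.AHB_API_URL).toString(),
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${env.AHB_API_SECRET}`,
        Accept: "text/event-stream",
      },
      body,
    },
  };
}

// Forward the request and hand the upstream SSE body to the browser unbuffered.
async function proxyRun(
  req: Request,
  env: ProxyEnv,
  fetchImpl: typeof fetch = fetch,
): Promise<Response> {
  const { url, init } = buildUpstreamRequest(env, await req.text());
  const upstream = await fetchImpl(url, init);
  if (!upstream.ok || !upstream.body) {
    return new Response("agent service unavailable", { status: 502 });
  }
  // Passing the ReadableStream through avoids buffering events in the function.
  return new Response(upstream.body, {
    status: 200,
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache, no-transform",
    },
  });
}
```

In an App Router route file, the exported POST handler would presumably call `proxyRun` with values read from the environment. The secret stays server-side and never reaches the browser.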

Build First

What to build first

  1. The annotation gate (validateAnnotations): stops agent calls on bad input

  2. The SSE orchestrator (api/run): the backbone that everything plugs into

  3. The agent loop with a tiny test component: proves the SDK wires up

  4. The independent verify step: never ship without it

  5. The builder UI with a working preview: fast iteration loop for authors
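The first item, the annotation gate, is small enough to sketch in full. The version below is an assumption-heavy illustration: this document only says the gate checks for data-component plus at least one annotation, so the `data-ahb-` attribute prefix, the `GateResult` shape, and the function name are hypothetical.

```typescript
interface GateResult {
  ok: boolean;
  errors: string[];
}

// data-component is named in the source.
const COMPONENT_ATTR = /\sdata-component\s*=\s*["'][^"']+["']/i;
// ASSUMPTION: annotations are attributes with a "data-ahb-" prefix. The real
// annotation syntax is not specified in this document.
const ANNOTATION_ATTR = /\sdata-ahb-[a-z0-9-]+\s*=/i;

function validateAnnotationsSketch(html: string): GateResult {
  const errors: string[] = [];
  if (!COMPONENT_ATTR.test(html)) {
    errors.push("missing data-component attribute");
  }
  if (!ANNOTATION_ATTR.test(html)) {
    errors.push("no annotation attributes found");
  }
  return { ok: errors.length === 0, errors };
}
```

A regex check is enough to stop obviously unannotated input before any API spend. A production gate would more likely parse the HTML rather than pattern-match it.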

Defer

What to defer

  • Playwright render harness — useful but not required for compiler proof

  • Auth / project management UI — wire the workspace resolver manually first

  • Cost dashboard — costUsd is already in the done event; a UI is cosmetic

  • Multi-project workspace provisioning — start with a single hardcoded sandbox

  • Auto-merge of conversion PRs — review stays human; the PR is the gate

Honest Risks

Agent writes bad imports

The agent reads existing primitives via Grep/Read before it writes any imports, and the tsc repair loop catches missing exports.

Infinite repair loop

A maxRepairs cap (default 3) bounds the loop. If the budget is exhausted, the agent must stop and report the remaining errors.
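However the cap is enforced in practice (prompt instruction, code, or both), the control flow amounts to a bounded loop. A minimal sketch, assuming injected `check` and `repair` callbacks and a hypothetical `RepairOutcome` shape; only the `maxRepairs` name and its default of 3 come from the text above.

```typescript
interface RepairOutcome {
  clean: boolean;
  repairsUsed: number;
  remainingErrors: string[];
}

// check() returns the current compiler errors; repair() attempts a fix.
// Both are injected so the loop itself stays independent of tsc and the agent.
function runRepairLoop(
  check: () => string[],
  repair: (errors: string[]) => void,
  maxRepairs = 3,
): RepairOutcome {
  let errors = check();
  let repairsUsed = 0;

  // Stop as soon as the build is clean or the repair budget is spent.
  while (errors.length > 0 && repairsUsed < maxRepairs) {
    repair(errors);
    repairsUsed += 1;
    errors = check();
  }

  // Remaining errors are reported rather than retried forever.
  return { clean: errors.length === 0, repairsUsed, remainingErrors: errors };
}
```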

Injection in projectId / branch

projectId must match the PROJECTS registry; assertValidBranch() rejects any ref that isn't a plain git ref. Repos are cloned with execFile (no shell).
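The two checks and the shell-free clone might look roughly like this. Everything concrete here is an assumption: the registry entry, the allowlist used for "plain git ref", the clone flags, and the helper names `resolveProject`, `assertValidBranchSketch`, and `cloneArgs` are illustrative, not the project's actual rules.

```typescript
// Hypothetical registry; the real PROJECTS entries are not shown in this document.
const PROJECTS: Record<string, { repoUrl: string }> = {
  "demo-site": { repoUrl: "https://github.com/example/demo-site.git" },
};

function resolveProject(projectId: string): { repoUrl: string } {
  // Own-property lookup, so "__proto__" or "constructor" never resolve.
  if (!Object.prototype.hasOwnProperty.call(PROJECTS, projectId)) {
    throw new Error(`unknown projectId: ${projectId}`);
  }
  return PROJECTS[projectId];
}

// ASSUMPTION: "plain git ref" is read as a conservative allowlist. It must start
// with an alphanumeric (so it can never look like a CLI flag) and contain only
// letters, digits, ".", "_", "/" and "-".
function assertValidBranchSketch(ref: string): void {
  const plain = /^[A-Za-z0-9][A-Za-z0-9._/-]*$/.test(ref);
  const suspicious =
    ref.includes("..") ||
    ref.includes("//") ||
    ref.endsWith("/") ||
    ref.endsWith(".") ||
    ref.endsWith(".lock");
  if (!plain || suspicious || ref.length > 200) {
    throw new Error(`invalid branch ref: ${ref}`);
  }
}

// Argument vector intended for execFile("git", args). Each value is passed as
// its own argv entry, so no shell ever interprets it.
function cloneArgs(projectId: string, branch: string, dest: string): string[] {
  const { repoUrl } = resolveProject(projectId);
  assertValidBranchSketch(branch);
  return ["clone", "--depth", "1", "--branch", branch, "--", repoUrl, dest];
}
```

The registry lookup means user input never becomes a repository URL, and the leading-character rule keeps a branch name from being parsed as a git option.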

Long-running agent times out

route.ts sets maxDuration: 300, in seconds (Vercel Fluid compute). The clone and convert steps run on Railway, not in the Vercel function, so the function only has to keep the proxied stream open.

False tsc pass in the agent's self-report

verifyIndependently() re-runs tsc after the agent finishes. The push is blocked unless this independent check is clean.
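The point of the independent check is that the push decision depends only on a compiler run the server performed itself. A minimal sketch of that interpretation step, assuming the exit code and output of `tsc --noEmit` have already been captured server-side; `interpretTscRun`, `mayPush`, and `VerifyResult` are hypothetical names.

```typescript
interface VerifyResult {
  clean: boolean;
  errors: string[];
}

// Interpret a finished `tsc --noEmit` run. exitCode and output come from the
// server-side process, never from anything the agent reported.
function interpretTscRun(exitCode: number, output: string): VerifyResult {
  // tsc diagnostics contain "error TS<code>:" in both plain and pretty formats.
  const errors = output
    .split(/\r?\n/)
    .filter((line) => /error TS\d+:/.test(line));
  // A non-zero exit counts as a failure even if no line matched the pattern.
  return { clean: exitCode === 0 && errors.length === 0, errors };
}

// The agent's own claim is accepted as a parameter only to make the point that
// it plays no part in the decision.
function mayPush(
  _agentSaysClean: boolean,
  committed: boolean,
  verify: VerifyResult,
): boolean {
  return committed && verify.clean;
}
```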