<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Juho Choi</title><description>I build the systems that make AI agents predictable in production.</description><link>https://juhochoi.me/</link><item><title>Project bootstrap</title><link>https://juhochoi.me/writing/engineering/project-bootstrap/</link><guid isPermaLink="true">https://juhochoi.me/writing/engineering/project-bootstrap/</guid><description>Start a project with a deliberately minimal harness — interactive USER gates, docs-first, no autonomous loop — and graduate to autonomy only after customer evidence (personas, interviews, simulation reports) appears.</description><pubDate>Sun, 10 May 2026 03:29:18 GMT</pubDate><content:encoded>&lt;h1 id=&quot;project-bootstrap&quot;&gt;&lt;a href=&quot;#project-bootstrap&quot;&gt;Project bootstrap&lt;/a&gt;&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;Start a project with a deliberately minimal harness — interactive USER gates, docs-first, no autonomous loop — and graduate to autonomy only after customer evidence (personas, interviews, simulation reports) appears.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;when-to-use&quot;&gt;&lt;a href=&quot;#when-to-use&quot;&gt;When to use&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;A brand-new agentic project with no existing codebase and no customer data.&lt;/li&gt;
&lt;li&gt;Solo or small team where the user holds the domain knowledge and the AI handles implementation.&lt;/li&gt;
&lt;li&gt;The project will run long enough that docs (architecture, ADRs) get re-read by future sessions.&lt;/li&gt;
&lt;li&gt;You want the AI to implement, not author the spec.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;when-not-to-use&quot;&gt;&lt;a href=&quot;#when-not-to-use&quot;&gt;When not to use&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Retrofitting an existing codebase — docs already exist or the code is the source of truth; the docs-creation step doesn&apos;t fit.&lt;/li&gt;
&lt;li&gt;A one-off script or short experiment — boilerplate overhead exceeds value.&lt;/li&gt;
&lt;li&gt;You already have a complete written spec — skip docs-creation, go straight to per-feature planning.&lt;/li&gt;
&lt;li&gt;Mature project with persona / customer-evidence inputs — switch to an autonomous-loop pattern with a critic agent.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;context&quot;&gt;&lt;a href=&quot;#context&quot;&gt;Context&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;A vague product idea handed to an agent fails in two ways: endless clarifying questions that stall, or silent invention that ships code missing the point. The bootstrap shape forces a documented intermediate step — the spec discussion happens once, gets crystallized into a fixed set of docs, and every downstream phase session reads those docs instead of re-discussing.&lt;/p&gt;
&lt;p&gt;The &quot;Stage 0&quot; qualifier is deliberate. Mature harness patterns (autonomous loops, persona simulation, evidence-based critic agents) need inputs that do not exist at project start. Forcing them in too early either rejects everything (no evidence to evaluate against) or rubber-stamps (the critic abandons its own rule). USER gates are the only honest substitute until customer evidence accumulates.&lt;/p&gt;
&lt;h2 id=&quot;pattern&quot;&gt;&lt;a href=&quot;#pattern&quot;&gt;Pattern&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Three structural decisions:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Docs are the first AI-generated artifact.&lt;/strong&gt; Before any feature code, the user discusses the product in an interactive session, then triggers a docs-creation prompt. The AI writes a fixed set (e.g. PRD, user flow, data schema, code architecture, ADRs) that becomes the single source of truth for everything downstream.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. USER replaces the critic agent at three gates.&lt;/strong&gt; The build skill pauses for user input at clarification, plan approval, and task-creation approval. No agent autonomously approves its own plans. The cost is latency; the benefit is preventing intent-drift in the absence of customer evidence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Phase isolation in ephemeral sessions.&lt;/strong&gt; Once a task is approved, an external runner spawns one fresh &lt;code&gt;claude -p&lt;/code&gt; per phase, sequentially. Each phase reads docs + previous phase outputs from disk. No long-lived session that accumulates context across phases.&lt;/p&gt;
&lt;p&gt;Minimal file shape:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;prompts/
  docs-create.md          # guidance for the foundational docs
  task-create.md          # guidance for task / phase files
.claude/skills/
  plan-and-build/         # docs → discuss → plan → run phases
  commit/                 # disciplined commit playbook
scripts/
  run-phases.py           # one ephemeral claude -p per phase
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;User flow:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;claude                                              # interactive
  ↓ user describes product
  ↓ user invokes prompts/docs-create.md
docs/{prd, flow, data-schema, code-architecture, adr}.md
  ↓ user invokes plan-and-build skill
plan-and-build  (3 user gates: clarify → plan → task-create)
  ↓
tasks/N-name/{index.json, phase{0..K}.md}
  ↓ scripts/run-phases.py N-name
phase 0 → phase 1 → ... → phase K                  # each in its own claude -p
  ↓
async notification (Discord or equivalent)
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&quot;trade-offs&quot;&gt;&lt;a href=&quot;#trade-offs&quot;&gt;Trade-offs&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Setup cost.&lt;/strong&gt; You cannot just &lt;code&gt;claude&lt;/code&gt; and start coding; the boilerplate must exist first. Mitigated by publishing it once as a GitHub template repo so cloning is one command.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Three blocking gates per task.&lt;/strong&gt; Each &lt;code&gt;plan-and-build&lt;/code&gt; invocation pauses for user response three times. Fine for solo synchronous work; latency for async teams.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Doc-code drift risk.&lt;/strong&gt; Docs are authoritative only if maintained. Phase 0 of every task is a docs-update step that catches most drift, but stale ADRs can sneak through if not pruned.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Five docs is opinionated.&lt;/strong&gt; PRD + flow + data-schema + code-architecture + ADR is a default, not a prescription. Tiny projects may only need PRD + ADR; complex ones may need more. Adjust the docs-creation prompt.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Async notification is a hard prerequisite for unattended runs.&lt;/strong&gt; Without Discord or equivalent, the user must watch the terminal during phase execution (which can take hours). Set this up before relying on the pattern for anything longer than a single short task.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;example&quot;&gt;&lt;a href=&quot;#example&quot;&gt;Example&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/jh-choi98/claude-harness-template&quot;&gt;claude-harness-template&lt;/a&gt; is a concrete Stage 0 boilerplate implementing this pattern. Nine files. Published as a GitHub template repo:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;gh repo create new-project \
  --template jh-choi98/claude-harness-template \
  --clone
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The template was extracted from a longer-running internal harness (argos) by reading its git log to identify what existed at MVP era versus what was added later for Stage 2+ features (persona simulation, autonomous loops, read-only critic agents). The bootstrap shape was the residue after stripping the Stage 2+ assets that depend on customer evidence.&lt;/p&gt;
&lt;h2 id=&quot;related-patterns&quot;&gt;&lt;a href=&quot;#related-patterns&quot;&gt;Related patterns&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;None yet. Future entries in this category will likely cover the Stage 2 graduation (persona-driven ideation, autonomous loops with rollback, read-only critic agents) and per-task git workflow choices (main-direct vs. feature branch).&lt;/em&gt;&lt;/p&gt;</content:encoded><category>ai-ml</category><category>agent-harness</category><category>patterns</category></item><item><title>Thin hook manifest</title><link>https://juhochoi.me/writing/engineering/thin-hook-manifest/</link><guid isPermaLink="true">https://juhochoi.me/writing/engineering/thin-hook-manifest/</guid><description>Have the hook config file declare only *what events to subscribe to* and route every one of them to a single handler binary; keep all branching, transformation, and policy in that handler.</description><pubDate>Mon, 04 May 2026 18:29:44 GMT</pubDate><content:encoded>&lt;h1 id=&quot;thin-hook-manifest&quot;&gt;&lt;a href=&quot;#thin-hook-manifest&quot;&gt;Thin hook manifest&lt;/a&gt;&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;Have the hook config file declare only &lt;em&gt;what events to subscribe to&lt;/em&gt; and route every one of them to a single handler binary; keep all branching, transformation, and policy in that handler.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;when-to-use&quot;&gt;&lt;a href=&quot;#when-to-use&quot;&gt;When to use&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Building external tooling (telemetry, audit, policy enforcement) on top of a harness that exposes events through a user-edited config file (e.g. Claude Code&apos;s &lt;code&gt;.claude/settings.json&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;You expect the integration&apos;s logic to evolve faster than the user&apos;s machine can be re-configured.&lt;/li&gt;
&lt;li&gt;You need typed payload handling, unit tests, or per-event branching that the manifest schema (&lt;code&gt;{matcher, command}&lt;/code&gt;) can&apos;t express.&lt;/li&gt;
&lt;li&gt;Many event kinds want the same treatment (&quot;log everything&quot;), so per-event configuration would just be repetition.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;when-not-to-use&quot;&gt;&lt;a href=&quot;#when-not-to-use&quot;&gt;When not to use&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;A one-off shell hook that does one thing and never changes — &lt;code&gt;&quot;command&quot;: &quot;echo done &gt;&gt; ~/log&quot;&lt;/code&gt; directly in the manifest is fine; wrapping it in a CLI is overkill.&lt;/li&gt;
&lt;li&gt;The manifest schema is already expressive enough for the full policy.&lt;/li&gt;
&lt;li&gt;You don&apos;t control a fast distribution channel for the handler. If updating the handler is harder than updating the manifest, the leverage flips and the manifest is the right place for logic.&lt;/li&gt;
&lt;li&gt;Hot paths where per-event handler-process spawn cost matters more than the cost of fanning out matchers in the manifest — then push filtering into the manifest.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;context&quot;&gt;&lt;a href=&quot;#context&quot;&gt;Context&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The pattern is illustrative, not a law. It earns its keep when &lt;em&gt;both&lt;/em&gt; of these are true: the manifest schema is too narrow for the real policy, AND the handler ships through a faster channel than the manifest does.&lt;/p&gt;
&lt;p&gt;Anything baked into the manifest becomes a deployment artifact — changing it forces every user to re-install or hand-edit. Anything in the handler propagates the next time the user updates the package. Hook config schemas are also typically narrow on purpose: matcher + command. Real policy (&quot;truncate this field, parse the transcript file, drop sub-agent events, count tokens&quot;) has nowhere to live in JSON.&lt;/p&gt;
&lt;h2 id=&quot;pattern&quot;&gt;&lt;a href=&quot;#pattern&quot;&gt;Pattern&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Two parts, with the asymmetry on purpose.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Manifest (thin).&lt;/strong&gt; Every event of interest points at the &lt;em&gt;same&lt;/em&gt; command. No per-event command, no matcher cleverness. Adding a new event type is one line.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &quot;hooks&quot;: {
    &quot;PreToolUse&quot;:   [{ &quot;matcher&quot;: &quot;&quot;, &quot;hooks&quot;: [{ &quot;type&quot;: &quot;command&quot;, &quot;command&quot;: &quot;myagent hook&quot; }] }],
    &quot;PostToolUse&quot;:  [{ &quot;matcher&quot;: &quot;&quot;, &quot;hooks&quot;: [{ &quot;type&quot;: &quot;command&quot;, &quot;command&quot;: &quot;myagent hook&quot; }] }],
    &quot;SessionStart&quot;: [{ &quot;matcher&quot;: &quot;&quot;, &quot;hooks&quot;: [{ &quot;type&quot;: &quot;command&quot;, &quot;command&quot;: &quot;myagent hook&quot; }] }],
    &quot;Stop&quot;:         [{ &quot;matcher&quot;: &quot;&quot;, &quot;hooks&quot;: [{ &quot;type&quot;: &quot;command&quot;, &quot;command&quot;: &quot;myagent hook&quot; }] }]
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Handler (thick).&lt;/strong&gt; One entry point. Every event lands here and branches on the event name in the payload. This is where typed parsing, transcript reads, network sends, truncation, and the never-block safeties live.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;myagent hook  (single binary)
  ├─ read JSON from stdin (with short timeout)
  ├─ branch on payload.hook_event_name
  │     ├─ PreToolUse → record tool args
  │     ├─ Stop       → parse transcript, summarize
  │     └─ ...
  ├─ ship to backend (detached, fire-and-forget)
  └─ exit 0 always (try/finally)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Corollaries:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Policy changes ship through the handler&apos;s package manager (&lt;code&gt;npm publish&lt;/code&gt;, &lt;code&gt;pip upload&lt;/code&gt;), not user re-configuration.&lt;/li&gt;
&lt;li&gt;Branching is a single-source-of-truth file; reading it tells the whole story of what the integration does.&lt;/li&gt;
&lt;li&gt;&quot;Never block the harness&quot; safeties — stdin timeout, detached I/O, catch-all exit 0 — are implemented once at the single sink, not duplicated per event.&lt;/li&gt;
&lt;li&gt;The handler&apos;s core functions (e.g. &lt;code&gt;buildPayload&lt;/code&gt;, &lt;code&gt;convertEventType&lt;/code&gt;) are ordinary pure functions, easy to unit-test.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;trade-offs&quot;&gt;&lt;a href=&quot;#trade-offs&quot;&gt;Trade-offs&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Process spawn per event.&lt;/strong&gt; Every tool call spawns the handler. The cost is real (tens of ms each) and adds up on busy sessions. If measurement shows it matters, push a coarse matcher into the manifest to skip uninteresting events at the source.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Opaque manifest.&lt;/strong&gt; A reader looking at the user&apos;s &lt;code&gt;settings.json&lt;/code&gt; sees only &lt;code&gt;myagent hook&lt;/code&gt; and learns nothing about what runs. You&apos;re trading manifest legibility for handler legibility — document the payload contract somewhere reachable.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Single point of failure.&lt;/strong&gt; If the handler binary isn&apos;t on PATH, every event fails. A common mitigation is to make the &lt;code&gt;SessionStart&lt;/code&gt; entry self-bootstrapping (&lt;code&gt;command -v myagent || install ...; myagent hook&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Coarse user controls.&lt;/strong&gt; Users can&apos;t easily say &quot;track Bash but not Read&quot; — the manifest doesn&apos;t know about the distinction. If user-tunable filtering matters, expose it via the handler&apos;s own config or env vars, not the manifest.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;example&quot;&gt;&lt;a href=&quot;#example&quot;&gt;Example&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The same shape recurs across infrastructure: webhook receivers (URL registration vs. receiver code), Kubernetes (&lt;code&gt;Service&lt;/code&gt;/&lt;code&gt;Ingress&lt;/code&gt; YAML vs. controller), AWS Lambda + API Gateway (route vs. function), GitHub Actions (&lt;code&gt;on:&lt;/code&gt; vs. step script), systemd (&lt;code&gt;ExecStart=&lt;/code&gt; vs. binary). The manifest answers &lt;em&gt;what/when&lt;/em&gt;, the handler answers &lt;em&gt;how&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;A textbook instance for Claude Code is the Argos telemetry CLI: its &lt;code&gt;.claude/settings.json&lt;/code&gt; injects identical &lt;code&gt;argos hook&lt;/code&gt; entries for every event type, and a single &lt;code&gt;hook&lt;/code&gt; command in the CLI does all routing internally.&lt;/p&gt;
&lt;h2 id=&quot;related-patterns&quot;&gt;&lt;a href=&quot;#related-patterns&quot;&gt;Related patterns&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;(none yet)&lt;/em&gt;&lt;/p&gt;</content:encoded><category>ai-ml</category><category>agent-harness</category><category>patterns</category></item><item><title>Human intervention log</title><link>https://juhochoi.me/writing/engineering/human-intervention-log/</link><guid isPermaLink="true">https://juhochoi.me/writing/engineering/human-intervention-log/</guid><description>A single append-only file where an autonomous harness records every task it could not automate, who handled it manually, and the condition under which automation could resume.</description><pubDate>Sun, 03 May 2026 01:10:12 GMT</pubDate><content:encoded>&lt;h1 id=&quot;human-intervention-log&quot;&gt;&lt;a href=&quot;#human-intervention-log&quot;&gt;Human intervention log&lt;/a&gt;&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;A single append-only file where an autonomous harness records every task it could not automate, who handled it manually, and the condition under which automation could resume.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;when-to-use&quot;&gt;&lt;a href=&quot;#when-to-use&quot;&gt;When to use&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;The harness runs an autonomous loop (e.g. ideate → plan → build → commit → check) that is meant to keep going without a human in each iteration.&lt;/li&gt;
&lt;li&gt;The loop will inevitably hit work that is physically outside its reach: console-only API key issuance, vendor/legal approvals, DNS at the registrar, payments, security-incident judgment calls, library or platform limitations the agent cannot bypass.&lt;/li&gt;
&lt;li&gt;You want future iterations (or future maintainers) to know &lt;em&gt;why&lt;/em&gt; a workaround exists and &lt;em&gt;when&lt;/em&gt; it would be safe to retry the automated path.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;when-not-to-use&quot;&gt;&lt;a href=&quot;#when-not-to-use&quot;&gt;When not to use&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;One-shot or short-lived agents — there is no later iteration that will read the log.&lt;/li&gt;
&lt;li&gt;Copilot-style interactive harnesses where human turns are the design, not the exception. Every action would qualify and the log degenerates into a transcript.&lt;/li&gt;
&lt;li&gt;Cases where the limit is genuinely permanent and uninteresting (e.g. &quot;only the CEO can sign this contract&quot;). A single static note in a runbook is enough; you do not need a log entry per occurrence.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;context&quot;&gt;&lt;a href=&quot;#context&quot;&gt;Context&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;An autonomous loop hits two kinds of failure: ones it can retry, and ones it physically cannot. If the human silently absorbs the second kind, three things are lost:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The audit trail — six months later nobody remembers why the metric was switched from &lt;code&gt;successRate&lt;/code&gt; to &lt;code&gt;medianDurationMs&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The retry trigger — the condition that would let the loop reclaim this task is in someone&apos;s head, not in the repo.&lt;/li&gt;
&lt;li&gt;Self-knowledge — the harness has no list of its own ceilings, so it keeps re-attempting impossible work and burning tokens.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The pattern is to make every manual override an explicit, structured entry instead of an undocumented save.&lt;/p&gt;
&lt;h2 id=&quot;pattern&quot;&gt;&lt;a href=&quot;#pattern&quot;&gt;Pattern&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Maintain one append-only markdown file in the repo (e.g. &lt;code&gt;docs/human-intervention.md&lt;/code&gt;). Each intervention gets one section with four required fields:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;## YYYY-MM-DD — short title

- Context: why the harness could not handle it
- Actor: who intervened
- Action: what they actually did
- Re-automatable: yes/no — &amp;#x3C;trigger condition, with a concrete check&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The four fields are the minimum needed for retrospection, retry, and audit. Drop one and the entry stops being useful:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Context&lt;/strong&gt; answers &quot;why was this not automated.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Actor&lt;/strong&gt; answers &quot;who owns the follow-up.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Action&lt;/strong&gt; answers &quot;what is the current state of the system.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Re-automatable&lt;/strong&gt; answers &quot;when, if ever, should the loop try again?&quot; The trigger should be checkable without judgment — a SQL query, a feature-flag probe, a vendor-changelog URL, an issue link. A vague &quot;someday when the API improves&quot; is not a trigger.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The harness can optionally read the file (or an index of it) at boot and treat listed items as known off-limits until their trigger fires.&lt;/p&gt;
&lt;h2 id=&quot;trade-offs&quot;&gt;&lt;a href=&quot;#trade-offs&quot;&gt;Trade-offs&lt;/a&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Discipline tax.&lt;/strong&gt; The log is only as good as the operator&apos;s habit of writing entries. Half-logged interventions are worse than none — they imply completeness that is not there.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Drift.&lt;/strong&gt; Triggers go stale (vendors ship APIs, internal limits change). The file needs periodic pruning, otherwise old entries fossilize and the loop never retries things it could now handle.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Not a fix.&lt;/strong&gt; The log makes the autonomy ceiling &lt;em&gt;visible&lt;/em&gt;, it does not raise it. A monotonically growing file is a signal the harness is accumulating debt, not paying it down.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;example&quot;&gt;&lt;a href=&quot;#example&quot;&gt;Example&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;A real entry from an autonomous harness: the loop generated a feature request to add a &lt;code&gt;successRate&lt;/code&gt; column to its agent-run dashboard. Investigation revealed the underlying tool runner did not surface &lt;code&gt;exit_code&lt;/code&gt; for built-in tools, so the data simply did not exist. The intervention recorded:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Context: success/failure data is unavailable for built-in tools at the data-source layer.&lt;/li&gt;
&lt;li&gt;Actor: repo owner.&lt;/li&gt;
&lt;li&gt;Action: replaced the proposed &lt;code&gt;successRate&lt;/code&gt; column with &lt;code&gt;medianDurationMs&lt;/code&gt;, which can be computed from existing fields.&lt;/li&gt;
&lt;li&gt;Re-automatable: yes — when the tool runner ships exit-code support, OR when a sampled SQL query against the runs table starts returning non-null exit codes. The query is pinned in the entry so a future iteration can evaluate the trigger automatically.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Because the trigger is concrete, a future loop can run the query, see the data has appeared, and reopen the original feature request without a human re-deciding the question.&lt;/p&gt;
&lt;h2 id=&quot;related-patterns&quot;&gt;&lt;a href=&quot;#related-patterns&quot;&gt;Related patterns&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;(none yet — see also runbooks, ADRs, and postmortems in general engineering practice; this pattern is the autonomous-loop-specific variant focused on retry conditions.)&lt;/em&gt;&lt;/p&gt;</content:encoded><category>ai-ml</category><category>agent-harness</category><category>patterns</category></item></channel></rss>