PAPER-2025-011

The Hermeneutic Triad

How Reviewers, Harness, and Agent Collaborate—a case study in parallel peer review revealing and resolving DRY violations.


Abstract

This case study documents a live incident from December 2025 in which the CREATE SOMETHING harness orchestrated parallel peer reviews that identified critical DRY violations in newsletter subscription code. Three specialized reviewers—architecture, security, and quality—analyzed the same codebase simultaneously and produced complementary findings: the architecture reviewer detected 4 pairs of nearly identical files across packages, the security reviewer identified an IDOR vulnerability, and the quality reviewer noted inconsistent error handling. This paper analyzes how this hermeneutic triad (a three-part interpretive system in which reviewers, harness, and agent each contribute a perspective the others lack) creates a self-correcting system that surfaces issues no single perspective would catch.

"Each reviewer sees through a different lens. The architecture reviewer asks 'Is this unified?' The security reviewer asks 'Is this safe?' The quality reviewer asks 'Is this clear?' Together, they reveal what none would see alone."

— Harness Peer Review Philosophy

I. The Incident: Taste Harness Pauses

On December 23, 2025, the CREATE SOMETHING harness was running the "Taste Collections & LLM Context" spec—a 6-feature project to enable users to curate design references and expose them to AI agents. After completing 2 of 6 features, the harness paused with an unexpected verdict:

┌─────────────────────────────────────────────────────────────┐
│  PEER REVIEW SUMMARY                                        │
├─────────────────────────────────────────────────────────────┤
│  Outcome: ❌ FAIL                                           │
│  Confidence: 90%                                            │
├─────────────────────────────────────────────────────────────┤
│  Findings: 15                                               │
│    🔴 Critical: 3                                           │
│    🟠 High: 2                                               │
│    🟡 Medium: 5                                             │
│    🔵 Low: 2                                                │
│    ⚪ Info: 3                                               │
├─────────────────────────────────────────────────────────────┤
│  Reviewers:                                                 │
│    ⚠️ security        pass_with_findings (5)               │
│    ❌ architecture    fail               (6)                │
│    ⚠️ quality         pass_with_findings (4)               │
├─────────────────────────────────────────────────────────────┤
│  ⏸️  PAUSING FOR REVIEW                                    │
│    • 3 critical finding(s)                                  │
│    • Blocking reviewer failed: architecture                 │
└─────────────────────────────────────────────────────────────┘

The harness had been implementing features, creating code, and making commits. But when it reached a checkpoint, it invoked three parallel peer reviewers. The architecture reviewer's verdict—FAIL—triggered a pause. What had it found?

II. The Three Reviewers: Parallel Perspectives

The harness employs three specialized reviewers, each analyzing the same codebase through a different philosophical lens:

Architecture

Asks: "Does the structure serve the whole?"

Applies DRY at the system level. Detects duplicate modules, violated boundaries, excessive coupling.

Security

Asks: "Can this be exploited?"

Scans for OWASP vulnerabilities, authentication gaps, authorization flaws, injection risks.

Quality

Asks: "Is this maintainable?"

Evaluates error handling, type safety, code clarity, test coverage, documentation.

These reviewers run in parallel—each receives the same git diff and file context but applies independent analysis. Their prompts are generated from the changed files, ensuring review focus matches implementation scope.
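A minimal sketch of this dispatch in TypeScript. The runReviewer and buildPrompt functions are stand-ins for harness internals this paper does not show; the result shapes mirror the summary box in Section I:

interface Finding {
  severity: 'critical' | 'high' | 'medium' | 'low' | 'info';
  title: string;
  files: string[];
}

interface ReviewResult {
  reviewer: string;
  verdict: 'pass' | 'pass_with_findings' | 'fail';
  confidence: number; // 0..1, e.g. 0.90 in the summary above
  findings: Finding[];
}

// Stand-ins for harness internals (hypothetical signatures).
declare function buildPrompt(name: string, changedFiles: string[]): string;
declare function runReviewer(
  name: string,
  input: { diff: string; changedFiles: string[]; prompt: string },
): Promise<ReviewResult>;

// Every reviewer receives the same diff; prompts are generated from the
// changed files so review focus matches implementation scope.
async function runPeerReview(
  diff: string,
  changedFiles: string[],
  reviewers: string[],
): Promise<ReviewResult[]> {
  return Promise.all(
    reviewers.map((name) =>
      runReviewer(name, { diff, changedFiles, prompt: buildPrompt(name, changedFiles) }),
    ),
  );
}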

Hermeneutic Complementarity

The three reviewers form a hermeneutic triad—three interpretive lenses that together reveal what no single lens would show. This mirrors the Subtractive Triad:

Reviewer        Triad Level            Question
Architecture    DRY (Implementation)   "Have I built this before?"
Security        Rams (Artifact)        "Does this earn its existence safely?"
Quality         Heidegger (System)     "Does this serve the whole?"

III. What They Found: The DRY Violations

Each reviewer produced findings, but the architecture reviewer's were critical. It identified 4 pairs of nearly-identical files duplicated across packages:

[CRITICAL] Duplicated unsubscribe UI component across 2 packages
  packages/io/src/routes/unsubscribe/+page.svelte
  packages/space/src/routes/unsubscribe/+page.svelte
  → Only differs in propertyName prop ("io" vs "space")
  → Recommend: Extract to @create-something/components

[CRITICAL] Duplicated unsubscribe page server logic across 2 packages
  packages/io/src/routes/unsubscribe/+page.server.ts
  packages/space/src/routes/unsubscribe/+page.server.ts
  → Identical token decoding, database operations
  → Recommend: Extract to shared newsletter module

[CRITICAL] Duplicated newsletter API endpoint across 2 packages
  packages/io/src/routes/api/newsletter/+server.ts
  packages/space/src/routes/api/newsletter/+server.ts
  → Nearly identical (200+ lines each)
  → Only differs in default source property

The architecture reviewer had detected what the coding agent hadn't noticed: the agent, working feature-by-feature, had created identical implementations for different properties rather than extracting shared functionality.

Security Findings (Complementary)

Meanwhile, the security reviewer flagged a different issue in the same area:

[HIGH] Potential IDOR in taste insights API
  packages/ltd/src/routes/api/taste/insights/+server.ts
  → User ID from query param without validation
  → Could allow access to other users' reading data
  → Recommend: Use authenticated user ID from session

The security and architecture reviewers saw different problems in overlapping code. Neither finding was wrong; both were necessary.
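The fix for the IDOR finding is mechanical: derive identity from the authenticated session instead of trusting a query parameter. A sketch in SvelteKit style, assuming locals.user is populated by an authentication hook and getInsightsForUser is the app's query helper (both assumptions, not shown in this paper):

import { json, error } from '@sveltejs/kit';
import type { RequestHandler } from './$types';

declare function getInsightsForUser(userId: string): Promise<unknown>; // hypothetical

export const GET: RequestHandler = async ({ locals }) => {
  // Before (vulnerable): const userId = url.searchParams.get('userId');
  // After: use the identity the server itself established.
  const user = locals.user; // assumption: set by an auth hook
  if (!user) throw error(401, 'Not authenticated');

  return json(await getInsightsForUser(user.id));
};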

IV. The Harness Response: Pause and Surface

When the architecture reviewer's verdict was FAIL, the harness entered a decision tree. Its configuration specified:

peerReview: {
  reviewers: ['security', 'architecture', 'quality'],
  blockingReviewers: ['security', 'architecture'],
  pauseOnCritical: true,
  minConfidence: 0.7
}

Because architecture was a blocking reviewer and it returned FAIL with critical findings, the harness:

  1. Created Beads issues for each critical finding
  2. Generated a checkpoint summarizing the review
  3. Paused execution for human review
  4. Left the agent context intact for resumption
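A sketch of that decision tree, reusing the result shapes from the Section II sketch and the config shape above (the function and its structure are illustrative, not the harness's actual code):

interface PeerReviewConfig {
  reviewers: string[];
  blockingReviewers: string[];
  pauseOnCritical: boolean;
  minConfidence: number;
}

// Returns the pause reasons; a non-empty list means: create Beads issues,
// write a checkpoint, and pause for human review.
function pauseReasons(results: ReviewResult[], config: PeerReviewConfig): string[] {
  const reasons: string[] = [];

  const critical = results.flatMap((r) =>
    r.findings.filter((f) => f.severity === 'critical'),
  );
  if (config.pauseOnCritical && critical.length > 0) {
    reasons.push(`${critical.length} critical finding(s)`);
  }

  for (const r of results) {
    if (
      config.blockingReviewers.includes(r.reviewer) &&
      r.verdict === 'fail' &&
      r.confidence >= config.minConfidence
    ) {
      reasons.push(`Blocking reviewer failed: ${r.reviewer}`);
    }
  }
  return reasons;
}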

The Pause as Clearing

Heidegger's concept of the clearing (Lichtung) is relevant here. The pause creates a space where what was hidden becomes visible. The agent was working—hammering away—and the code was ready-to-hand. The reviewer's FAIL verdict made the duplication present-at-hand: visible as a problem rather than invisible as tool-use.

"The clearing is not a bounded space but the opening in which beings can show themselves."

— Adapted from Heidegger

V. The Resolution: Agent as Healer

With the findings surfaced, the agent (Claude Code) could address them. The resolution followed a pattern:

# 1. Create shared module
packages/components/src/lib/newsletter/
├── index.ts          # Exports
├── types.ts          # Shared types
├── unsubscribe.ts    # processUnsubscribe()
├── subscribe.ts      # processSubscription()
└── UnsubscribePage.svelte  # Shared component

# 2. Update consumers (io, space, agency)
packages/io/src/routes/api/newsletter/+server.ts
  - import { processSubscription } from '@create-something/components/newsletter';
  - Now 25 lines instead of 227

# 3. Type-check verification
$ pnpm --filter=io exec tsc --noEmit  # ✓
$ pnpm --filter=space exec tsc --noEmit  # ✓

The resolution reduced ~900 lines of duplicated code to ~200 lines of shared code with property-specific configuration. Each consumer now imports from the shared module and passes its property identifier.
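Sketched with hypothetical signatures (the shared module's actual API is not shown in this paper), the extraction puts the logic in one place, and each consumer passes only its property identifier:

// packages/components/src/lib/newsletter/subscribe.ts (shape only)
export interface SubscribeOptions {
  email: string;
  source: 'io' | 'space' | 'agency'; // the property-specific configuration
}

export async function processSubscription(
  opts: SubscribeOptions,
): Promise<{ success: boolean }> {
  // ...token generation, database write, confirmation email: shared once...
  return { success: true };
}

// packages/io/src/routes/api/newsletter/+server.ts (thin consumer)
import { json } from '@sveltejs/kit';
import { processSubscription } from '@create-something/components/newsletter';

export async function POST({ request }: { request: Request }) {
  const { email } = await request.json();
  return json(await processSubscription({ email, source: 'io' }));
}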

Closing the Findings

With the fix committed, the agent closed the finding issues:

$ bd close csm-cisd csm-fmcw csm-bf8h \
    --reason "Fixed: Consolidated to @create-something/components/newsletter"

✓ Closed csm-cisd: [CRITICAL] Duplicated unsubscribe UI component
✓ Closed csm-fmcw: [CRITICAL] Duplicated unsubscribe page server logic
✓ Closed csm-bf8h: [CRITICAL] Duplicated newsletter API endpoint

VI. Analysis: Why This Works

This case study reveals several properties of the hermeneutic triad approach:

1. Parallel Analysis Prevents Tunnel Vision

The coding agent, focused on completing features, naturally creates local solutions. It wasn't "wrong" to create similar code in different packages—each implementation worked correctly. Only the architectural lens, examining cross-package structure, could see the duplication.

2. Blocking Reviewers Create Natural Checkpoints

By configuring architecture as a blocking reviewer, the harness ensured DRY violations couldn't accumulate silently. The pause forced attention to a structural issue that would otherwise compound.

3. Findings as Issues Enable Tracking

Each finding became a Beads issue. This means:

  • Findings can be prioritized alongside regular work
  • Related findings can be grouped (same root cause)
  • Resolution is tracked with commit references
  • Similar findings from other harnesses can be closed together

4. Agent as Both Creator and Resolver

The same agent that created the duplication could resolve it. This isn't a flaw—it's the system working as designed. The agent operates in two modes:

Zuhandenheit (Creating)

Working within the codebase, implementing features, tool-use is transparent. Duplication is invisible because each file works.

Vorhandenheit (Resolving)

Examining the codebase as object, seeing structure rather than function. Duplication is visible because we're analyzing, not using.

VII. The Broader Pattern: Self-Correcting Systems

The triad of harness, reviewers, and agent forms a self-correcting system—a hermeneutic circle at the scale of software development:

┌─────────────────────────────────────────────────────────────┐
│                  THE HERMENEUTIC CIRCLE                     │
│                                                             │
│        Agent                      Reviewers                 │
│          │                            │                     │
│     creates code ──────────► reveals issues                 │
│          │                            │                     │
│          ▼                            ▼                     │
│        Harness ◄────────────────── Harness                  │
│    (orchestrates)                (pauses)                   │
│          │                            │                     │
│          ▼                            ▼                     │
│     Agent fixes ──────────► Reviewers verify                │
│          │                            │                     │
│          └────────► Harness resumes ◄─┘                     │
│                                                             │
└─────────────────────────────────────────────────────────────┘

This is not a waterfall but a circle. Each iteration improves understanding:

  • The agent understands the codebase better through fixing revealed issues
  • The reviewers calibrate detection based on what the agent creates
  • The harness learns pause thresholds from human overrides

Canon Alignment in Practice

The Subtractive Triad manifests at multiple levels:

Level      DRY                          Rams                        Heidegger
Code       Unify duplicates             Each file earns existence   Modules serve system
Review     Three lenses, one analysis   Only actionable findings    Reviews serve quality
Process    Beads is single source       Minimal ceremony            Harness serves work

VIII. Implications: Reviewer Design

This case study suggests principles for designing effective review triads:

1. Orthogonal Perspectives

Reviewers should cover different concerns. Our triad—architecture, security, quality—has minimal overlap. Each can fail independently, and each provides unique signal.

2. Graduated Blocking

Not all reviewers should block. In our configuration:

  • Architecture: Blocks (structural issues compound)
  • Security: Blocks (vulnerabilities are critical)
  • Quality: Advisory (minor issues can queue)

3. Parallel Execution

Running reviewers in parallel (not in sequence) is essential. Total review time equals that of the slowest single reviewer, not the sum of all. For large diffs, this saves significant time.

4. Issue Integration

Findings become issues in the same tracker as features. This means:

  • Humans can reprioritize findings relative to features
  • Multiple runs can reference the same underlying issue
  • Resolution ties to commits in the same workflow

IX. How to Apply This

This section shows how to configure parallel peer review in your own autonomous development workflows. The pattern works for any harness system that supports checkpoints and issue creation.

Step-by-Step Process

Step 1: Configure Reviewers (Human)
Define three specialized reviewers in harness config:
- architecture: DRY violations, coupling, module boundaries
- security: OWASP Top 10, auth gaps, injection risks
- quality: Error handling, test coverage, maintainability
Ensure orthogonal perspectives (minimal overlap).

Step 2: Set Blocking Rules (Human)
Decide which reviewers can pause execution:
- architecture: blocking (structural issues compound)
- security: blocking (vulnerabilities are critical)
- quality: advisory (minor issues can queue)
Configure pauseOnCritical: true for blocking reviewers.

Step 3: Enable Parallel Execution (Agent)
Run reviewers simultaneously, not sequentially:
- Each receives same git diff and file context
- Total review time = longest single reviewer
- Results merge into unified findings list
Use Promise.all() or equivalent concurrency primitive.
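For the merge step, one plausible shape, reusing the ReviewResult type from the Section II sketch (hypothetical, not the harness's code):

// Merge parallel results into one findings list, tagged by reviewer,
// and tally severities for the summary box.
function mergeFindings(results: ReviewResult[]) {
  const findings = results.flatMap((r) =>
    r.findings.map((f) => ({ ...f, reviewer: r.reviewer })),
  );
  const bySeverity = new Map<string, number>();
  for (const f of findings) {
    bySeverity.set(f.severity, (bySeverity.get(f.severity) ?? 0) + 1);
  }
  return { findings, bySeverity };
}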

Step 4: Create Finding Issues (Agent)
When reviewer returns critical findings:
- Create Beads issue for each finding
- Label with reviewer name and severity
- Link to checkpoint for context
- Add dependencies if findings are related
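A sketch of that mapping, reusing the Finding shape from the Section II sketch; the issue record here is illustrative, not the Beads schema:

// Illustrative issue record; label and body conventions are assumptions.
interface FindingIssue {
  title: string;        // e.g. "[CRITICAL] Duplicated newsletter API endpoint"
  labels: string[];     // reviewer name and severity
  body: string;         // affected files, recommendation, checkpoint link
  dependsOn?: string[]; // related findings sharing a root cause
}

function toIssue(
  f: Finding & { reviewer: string },
  checkpointId: string,
): FindingIssue {
  return {
    title: `[${f.severity.toUpperCase()}] ${f.title}`,
    labels: [`reviewer:${f.reviewer}`, `severity:${f.severity}`],
    body: `${f.files.join('\n')}\n\nCheckpoint: ${checkpointId}`,
  };
}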

Step 5: Pause and Surface (Harness)
When blocking reviewer fails:
- Halt execution immediately
- Generate checkpoint summary
- Preserve agent context for resumption
- Alert human for review decision

Step 6: Resolve and Resume (Agent + Human)
Human reviews findings, agent implements fixes:
- Close findings with commit references
- Update harness context with resolution
- Re-run reviewers if needed
- Resume execution when clear

Real-World Example: API Endpoint Duplication

Let's say your agent builds three similar API routes across different packages:

// packages/shop/src/routes/api/subscribe/+server.ts
// (each file imports json from '@sveltejs/kit' plus its db, table, and helpers; omitted here)
export async function POST({ request }) {
  const { email } = await request.json();
  const token = generateToken(email);
  await db.insert(subscribers).values({ email, token });
  await sendConfirmationEmail(email, token);
  return json({ success: true });
}

// packages/blog/src/routes/api/subscribe/+server.ts
export async function POST({ request }) {
  const { email } = await request.json();
  const token = generateToken(email);
  await db.insert(blogSubscribers).values({ email, token });
  await sendConfirmationEmail(email, token);
  return json({ success: true });
}

// packages/newsletter/src/routes/api/subscribe/+server.ts
export async function POST({ request }) {
  const { email } = await request.json();
  const token = generateToken(email);
  await db.insert(newsletterSubscribers).values({ email, token });
  await sendConfirmationEmail(email, token);
  return json({ success: true });
}

The architecture reviewer detects:

[CRITICAL] Duplicated subscription logic across 3 packages
→ packages/shop/src/routes/api/subscribe/+server.ts
→ packages/blog/src/routes/api/subscribe/+server.ts
→ packages/newsletter/src/routes/api/subscribe/+server.ts
→ Only differs in table name
→ Recommend: Extract to @myapp/subscriptions package

Meanwhile, the security reviewer flags:

[HIGH] Missing rate limiting on subscription endpoints
→ All three endpoints accept unlimited POST requests
→ Vulnerable to subscription bombing
→ Recommend: Add rate limiting middleware

Harness creates issues for both findings, pauses, and alerts human. Agent then:

  1. Creates @myapp/subscriptions package
  2. Extracts shared logic to processSubscription(table, email)
  3. Adds rate limiting middleware to all endpoints
  4. Updates consumers to import shared function
  5. Closes findings with commit references
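Under the same assumptions as the duplicated snippets above, the extracted module and one thin consumer might look like this; the @myapp/subscriptions name comes from the reviewer's recommendation, and $lib/rate-limit is a hypothetical middleware module:

// @myapp/subscriptions: shared logic, parameterized by table.
// (db, generateToken, sendConfirmationEmail as in the snippets above.)
export async function processSubscription(table: unknown, email: string) {
  const token = generateToken(email);
  await db.insert(table).values({ email, token });
  await sendConfirmationEmail(email, token);
  return { success: true };
}

// packages/shop/src/routes/api/subscribe/+server.ts (thin consumer)
import { json } from '@sveltejs/kit';
import { processSubscription } from '@myapp/subscriptions';
import { subscribers } from '$lib/schema';
import { rateLimit } from '$lib/rate-limit'; // hypothetical middleware

export async function POST({ request, getClientAddress }) {
  await rateLimit(getClientAddress()); // addresses the security finding
  const { email } = await request.json();
  return json(await processSubscription(subscribers, email));
}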

When to Use Parallel Peer Review

Use this pattern when:

  • Autonomous work: Agent-driven development with harness orchestration
  • Multi-file changes: Checkpoints cover significant scope (3+ files)
  • Quality gates matter: Structural or security issues can't accumulate silently
  • Hermeneutic continuity: Work spans multiple sessions, understanding must persist

Don't use for:

  • Single-file changes or trivial fixes
  • Exploratory prototyping (no established patterns yet)
  • Emergency hotfixes (review adds latency)
  • Human-driven development (peer review happens via PR)

Calibrating Reviewer Sensitivity

Over time, tune your reviewers based on false positive rates:

Symptom                        Adjustment
Too many trivial findings      Increase severity threshold for pause
Missing critical issues        Add specific patterns to reviewer prompts
Reviews taking too long        Reduce context window or use a faster model
Human overriding frequently    Demote reviewer to advisory (non-blocking)

The goal is a self-correcting system, not a gate-keeping system. Reviewers should catch real issues while allowing good work to proceed. Calibrate continuously based on outcomes.
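In configuration terms, calibration is usually a small diff. A sketch extending the peerReview shape from Section IV; the reviewerOverrides field and its knobs are assumptions, not an existing schema:

peerReview: {
  reviewers: ['security', 'architecture', 'quality'],
  blockingReviewers: ['security', 'architecture'], // quality stays advisory
  pauseOnCritical: true,
  minConfidence: 0.8,          // raised from 0.7 after too many trivial pauses
  reviewerOverrides: {         // hypothetical per-reviewer tuning
    quality: { maxContextFiles: 20, model: 'fast' }
  }
}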

X. Conclusion: Collaboration, Not Control

The hermeneutic triad—reviewers, harness, and agent—demonstrates a form of AI collaboration that isn't about control but about complementary perspectives. Each element sees what others cannot:

  • The agent sees how to implement; it cannot see the duplication it creates
  • The reviewers see patterns across files; they cannot implement fixes
  • The harness sees workflow; it cannot analyze or create

Together, they form a system that is more capable than any element alone. The 3 critical DRY violations were not bugs—the code worked—but architectural debt that would compound. The parallel review caught them before they spread further.

"The hermeneutic circle is not a vicious circle but a virtuous one. Understanding advances through the interplay of parts and whole."

The duplication was resolved in a single commit. The harness resumed. The remaining features continued to be implemented, now with a shared newsletter module that prevents future duplication. The system learned, not through explicit training, but through architectural enforcement.

This is the promise of the hermeneutic triad: not AI that never errs, but AI that catches its own errors through structured self-reflection.

Appendix: Incident Timeline

2025-12-23T03:34 — Harness started from taste-collections-llm.md

2025-12-23T03:35 — Session #1: Reading Insights Dashboard (complete)

2025-12-23T03:40 — Session #2: Agent Context API (complete)

2025-12-23T03:45 — Checkpoint triggered, peer review started

2025-12-23T03:46 — Parallel reviews: security, architecture, quality

2025-12-23T03:46 — Architecture: FAIL (6 findings, 3 critical)

2025-12-23T03:46 — Harness paused, findings created as issues

2025-12-23T03:52 — Agent addresses DRY violations

2025-12-23T03:55 — Creates @create-something/components/newsletter

2025-12-23T03:57 — Updates io, space, agency to use shared module

2025-12-23T03:58 — Type-check passes, commit created

2025-12-23T03:59 — Findings closed, harness resumed

References

  1. Heidegger, M. (1927). Being and Time. Trans. Macquarrie & Robinson.
  2. CREATE SOMETHING. (2025). "The Autonomous Harness." createsomething.io/papers/autonomous-harness-architecture
  3. CREATE SOMETHING. (2025). "Harness Patterns." .claude/rules/harness-patterns.md
  4. CREATE SOMETHING. (2025). "The Subtractive Triad." createsomething.ltd/principles
  5. Anthropic. (2025). "Effective Harnesses for Long-Running Agents."