Documentation Index

Fetch the complete documentation index at: https://panopticon-cli.com/llms.txt

Use this file to discover all available pages before exploring further.

Specialist Agents

Work agents write code. Specialists turn that code into something you can actually merge. They review, test, click through the UI, resolve conflicts, and hand off to each other automatically; you just click Merge at the end.

[Image: Panopticon kanban board with specialists running]

Overview

A specialist is a focused agent with one job. It takes a completed piece of work, does that job, reports passed or failed, and either advances the work to the next stage or bounces it back to the work-agent with feedback.

Specialists are per-project and ephemeral: they spawn on demand against a workspace, do their job, and terminate. There is no global specialist pool to warm up, no long-lived tmux sessions to babysit, and nothing to initialize before you start working.

What makes them different from work agents:
  • Narrow scope. Each specialist has one responsibility and a purpose-built prompt.
  • Per-project. Spawned against a specific project’s workspace with the right tools and context.
  • Queued. If one specialist is busy, new tasks queue up and drain automatically.
  • Coordinated. Cloister handles handoffs between stages.

The Five Specialists

| Specialist | Purpose | Trigger |
| --- | --- | --- |
| review-agent | Code review before merge | Human clicks Review (dashboard) |
| test-agent | Runs the full test suite | Auto after review passes |
| inspect-agent | Per-status-change verification | Any specialist reports passed |
| uat-agent | Browser-based acceptance testing via Playwright | Auto after tests pass |
| merge-agent | All merges + conflict resolution | Human clicks Approve & Merge |
[Image: Specialists dashboard showing Cloister Deacon and per-project specialists]

Review Pipeline Flow

The happy path is a sequential handoff. A human kicks it off; the rest is automatic until the final merge click.
Human clicks "Review"


┌───────────────────┐
│   review-agent    │  logic, security, perf, style
└─────────┬─────────┘
          │ passed → queue test-agent
          │ failed → feedback to work-agent

┌───────────────────┐
│    test-agent     │  runs full test suite
└─────────┬─────────┘
          │ passed → queue uat-agent
          │ failed → feedback to work-agent

┌───────────────────┐
│    uat-agent      │  browser walks through ACs
└─────────┬─────────┘
          │ passed → ready-to-merge
          │ failed → feedback with screenshots

┌───────────────────┐
│   Human clicks    │  <— only human gate after Review
│ "Approve & Merge" │
└─────────┬─────────┘

┌───────────────────┐
│   merge-agent     │  merges, resolves conflicts, reruns tests
└───────────────────┘
inspect-agent runs alongside this pipeline rather than inside it: every time any specialist reports passed, inspect verifies the state transition against the spec before the next stage runs.

[Image: Model + specialist handoffs tracked over time]
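The sequential handoff above can be sketched in a few lines. This is an illustrative model, not Panopticon's actual code: the stage names mirror the diagram, but `nextStage` is a hypothetical helper.

```typescript
// Hypothetical sketch of the review pipeline's pass/fail routing.
type Stage = "review-agent" | "test-agent" | "uat-agent" | "ready-to-merge";
type Outcome = "passed" | "failed";

const PIPELINE: Stage[] = ["review-agent", "test-agent", "uat-agent", "ready-to-merge"];

// On "passed" the work advances to the next stage; on "failed" it
// bounces back to the work-agent with feedback.
function nextStage(current: Stage, outcome: Outcome): Stage | "work-agent" {
  if (outcome === "failed") return "work-agent";
  const i = PIPELINE.indexOf(current);
  return PIPELINE[Math.min(i + 1, PIPELINE.length - 1)];
}
```

A failed outcome at any stage routes back to the work-agent; inspect-agent sits outside this function because it observes transitions rather than owning a stage of its own.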

Review Agent

The review-agent is the gatekeeper. It reads the diff, the PRD, and the vBRIEF plan, and looks for:
  • Logic errors and missed edge cases
  • Security vulnerabilities (OWASP top 10, injection, auth bypass)
  • Performance issues (N+1 queries, unnecessary work, leaked resources)
  • Code quality and adherence to project conventions
Triggering review-agent:
# Via dashboard: click "Review" on an issue card
# Via CLI:
pan specialists wake review-agent --task "review PAN-655"
Review outcomes:
  • passed — queues test-agent automatically
  • failed — sends structured feedback to the work-agent, blocks the pipeline
  • skipped — not applicable (e.g., docs-only change)
Review can also run in convoy mode — multiple reviewers in parallel, each with a focused lens (security, performance, requirements). See the code-review convoy.

Test Agent

The test-agent runs every configured test suite for the project and analyzes the failures rather than just reporting pass/fail. What it does:
  1. Runs all configured suites (backend, frontend unit, integration, e2e)
  2. Diagnoses failures — flake vs. real regression vs. environmental
  3. Reports results with actionable, file:line-referenced feedback
  4. On pass, queues uat-agent (if enabled) or marks the work ready-to-merge
Test outcomes:
  • passed — advances to UAT or ready-to-merge
  • failed — feedback with failing test names and excerpts goes back to the work-agent
  • skipped — no test suite applies to this change
  • dispatch_failed — Cloister couldn’t launch the test run (infra issue, not a code failure)
Project configuration in ~/.panopticon/projects.yaml:
projects:
  myproject:
    name: "My Project"
    path: /home/user/projects/myproject
    tests:
      backend:
        command: "npm test"
        timeout: 300000
      frontend_unit:
        command: "cd frontend && npm run test:unit"
        timeout: 180000
      e2e:
        command: "npm run test:e2e"
        timeout: 600000
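For illustration, the four test outcomes above can be modeled as a roll-up over per-suite results. This is a sketch under assumed semantics (dispatch_failed outranks everything because it is an infra signal, not a code failure), not the test-agent's real logic.

```typescript
// Illustrative roll-up of per-suite results into one test-agent outcome.
type SuiteResult = "passed" | "failed" | "skipped" | "dispatch_failed";

function overallOutcome(results: SuiteResult[]): SuiteResult {
  // Infra problem: report it as such rather than blaming the code.
  if (results.some(r => r === "dispatch_failed")) return "dispatch_failed";
  // Any real failure blocks the pipeline.
  if (results.some(r => r === "failed")) return "failed";
  // Nothing applied to this change.
  if (results.every(r => r === "skipped")) return "skipped";
  return "passed";
}
```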

Inspect Agent

The inspect-agent performs per-status-change verification. Whenever any specialist reports a status transition (review passed, test passed, etc.), inspect re-reads the spec and checks that the new state actually matches what the PRD asked for. Why it matters:
  • Catches spec drift before it compounds across multiple stages
  • Prevents a review-agent that’s too generous from sliding a broken change through
  • Keeps the pipeline honest about the difference between “it compiled” and “it works”
Inspect is wired into /api/specialists/done — it fires on status change, not on a fixed schedule. Enable it per-project:
projects:
  myproject:
    specialists:
      inspect_agent:
        enabled: true

UAT Agent

The UAT (User Acceptance Testing) agent opens a real browser via Playwright MCP and walks through the acceptance criteria like a user would. What it does:
  1. Reads the PRD and the vBRIEF acceptance criteria
  2. Launches a browser against the project’s dev URL
  3. Executes each AC as a user flow
  4. Takes screenshots as evidence (pass and fail)
  5. On pass, marks ready-to-merge; on fail, sends screenshots + DOM snapshots back
UAT outcomes:
  • passed — all ACs verified, ready for human merge approval
  • failed — feedback includes the failing step, a screenshot, and the observed DOM
  • skipped — backend-only or otherwise UI-irrelevant change
projects:
  myproject:
    specialists:
      uat_agent:
        enabled: true
        dev_url: "https://myapp.localhost"

Merge Agent

The merge-agent handles every merge, not just conflicted ones. This is deliberate:
  • It sees every diff that flows through the pipeline, building context
  • When conflicts do occur, it already understands the codebase
  • Tests are always re-run post-merge, catching integration regressions
[Image: Inspector panel with merge-agent terminal streaming live]

Workflow:
  1. Pull latest main
  2. Analyze the incoming diff
  3. Merge the feature branch
  4. Resolve conflicts with AI, documenting each decision
  5. Re-run tests post-merge
  6. Commit the merge with a descriptive message
  7. Report results back to the dashboard
Triggering merge-agent:
# Via dashboard: click "Approve & Merge" on an issue card
# Via CLI:
pan specialists wake merge-agent --task "merge PAN-655"
The merge-agent prompt forbids force-push, requires test re-runs, and mandates that conflict resolution decisions are documented in the merge commit message.
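As a toy illustration of the force-push prohibition, a command blocklist might look like this. The real enforcement is in the specialist prompt and wake-path validation, not a regex filter; `FORBIDDEN` and `isAllowed` are hypothetical names.

```typescript
// Hypothetical guardrail: reject git commands that force-push.
const FORBIDDEN: RegExp[] = [/push\s+--force/, /push\s+-f\b/];

function isAllowed(cmd: string): boolean {
  return !FORBIDDEN.some(re => re.test(cmd));
}
```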

Queue Processing

Each specialist has a per-project task queue at ~/.panopticon/agents/{name}/hook.json, managed via the FPP (Fixed Point Principle) — borrowed from Gastown, inspired by Doctor Who: any runnable action is a fixed point and must resolve before the system can rest.
1. Task arrives (via API or handoff)


2. wakeSpecialistOrQueue() checks if specialist is busy

        ├── IDLE: wake specialist immediately with task

        └── BUSY: push task onto hook.json


3. On specialist completion (/api/specialists/done):

        ├── Cloister marks specialist idle
        ├── Drains the next queued task
        └── Wakes the specialist with it
Queue priority order: urgent > high > normal > low. The FPP watchdog notices when a specialist has pending hook work but is idle, and sends escalating nudges until the work resolves.

Agent Self-Requeue (Circuit Breaker)

After a human kicks off the first review, a work-agent can request re-review automatically when it thinks it has fixed the feedback:
pan work request-review MIN-123 -m "Fixed: added tests for edge cases"
The circuit breaker prevents infinite “fail → refix → re-review” loops:
  • First human Review click resets the counter to 0
  • Each pan work request-review increments it
  • After 7 automatic re-requests, the endpoint returns HTTP 429
  • A human must click Review in the dashboard to unstick it
API endpoint: POST /api/workspaces/:issueId/request-review
The constant lives at src/dashboard/server/routes/workspaces.ts:81 (MAX_AUTO_REQUEUE = 7).
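A minimal sketch of the counter logic: only MAX_AUTO_REQUEUE = 7 and the 429 status come from this page; the per-issue Map and the 202 success status are assumptions.

```typescript
// Hypothetical circuit-breaker counter for automatic re-review requests.
const MAX_AUTO_REQUEUE = 7;
const counters = new Map<string, number>();

// A human Review click resets the counter for that issue.
function humanReview(issueId: string): void {
  counters.set(issueId, 0);
}

// Each automatic re-request increments; past the limit, respond 429
// until a human clicks Review in the dashboard.
function requestReview(issueId: string): number {
  const n = (counters.get(issueId) ?? 0) + 1;
  counters.set(issueId, n);
  return n > MAX_AUTO_REQUEUE ? 429 : 202;
}
```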

Specialist Safeguards

Specialists are constrained to prevent them from corrupting the main project repo:
  1. Spawned at project root — the workspace directory is passed as task context, never by cd-ing into it blindly
  2. Never checkout branches — they work with whatever branch the workspace already has
  3. Workspace-first operations — pan workspace create <ISSUE-ID> is the only way to create new work
These are enforced in three layers:
  • Prompt — clear, repeated warnings in every specialist prompt template
  • Code — wake paths validate the target before spawning
  • Git hooks — scripts/git-hooks/post-checkout auto-reverts any checkout detected inside a specialist tmux session
See Workspace Commands for branch protection details.

Configuration

Specialist configuration lives in ~/.panopticon/cloister.toml:
[specialists.review_agent]
enabled = true
auto_wake = false

[specialists.test_agent]
enabled = true
auto_wake = true

[specialists.inspect_agent]
enabled = true

[specialists.uat_agent]
enabled = true

[specialists.merge_agent]
enabled = true
auto_wake = false

[model_selection.specialist_models]
review_agent   = "sonnet"
test_agent     = "haiku"
merge_agent    = "sonnet"
planning_agent = "opus"
Options:
  • enabled — whether the specialist runs at all
  • auto_wake — auto-wake on trigger vs. wait for an explicit wake signal
  • [model_selection.specialist_models] — per-specialist model override (haiku / sonnet / opus)
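As a sketch of how model selection could resolve (illustrative: `modelFor` and the fallback behavior are assumptions, while the table values come from the TOML example above):

```typescript
// Hypothetical resolver for [model_selection.specialist_models]:
// use the per-specialist override, else fall back to a default model.
const specialistModels: Record<string, string> = {
  review_agent: "sonnet",
  test_agent: "haiku",
  merge_agent: "sonnet",
  planning_agent: "opus",
};

function modelFor(specialist: string, fallback = "sonnet"): string {
  return specialistModels[specialist] ?? fallback;
}
```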

When to Disable Each Specialist

Specialists are powerful but not free — each one adds latency and API cost. Sensible tradeoffs:
  • review-agent — almost never disable. The one gate between work-agents and prod.
  • test-agent — disable only if your test suite is broken or prohibitively slow. Fix the suite instead.
  • inspect-agent — disable for projects without a PRD/spec culture; it has nothing to verify against.
  • uat-agent — disable for backend-only services or CLIs with no browser surface.
  • merge-agent — keep enabled even for conflict-free projects; it’s your integration test safety net.

Viewing Specialist Status

# All specialists with their current state
pan specialists list

# JSON output for scripting
pan specialists list --json

# What's sitting in a specialist's queue
pan specialists queue review-agent

# Reset a wedged specialist (clears session, starts fresh)
pan specialists reset review-agent

# Clear a stuck queue without touching the session
pan specialists clear-queue review-agent

# Tail an active run in real-time
pan specialists logs myproject review-agent --tail
[Image: Project health panel with per-project specialist roll-up]

You can also watch handoffs and costs flow through the dashboard as they happen.

[Image: Cost tracking across models and specialists]

See Also

  • Cloister — the lifecycle manager that coordinates specialists
  • Convoys — parallel specialist execution for code review
  • Agent Commands — full CLI reference for working with agents