MetaCoder Documentation

Command reference for MetaCoder's Harness Engineering pipeline (/harness) and the four automation tools. Syntax, flags and defaults on this page are verified against the MetaCoder source code.

Every command runs inside MetaCoder (Desktop or the agent prompt). Slash commands are typed as shown, e.g. /harness status.

Overview

MetaCoder is a graph-first AI development tool for large-scale systems. It turns a codebase into a Semantic Knowledge Graph and drives AI development through Harness Engineering — a disciplined 10-phase pipeline with machine-verified quality gates. Two command families:

/harness — the discipline system: a 10-phase state machine with roles, gates, context layers and an audit trail. 18 subcommands.
Automation tools — /modernize, /newproject, /systest, /env: end-to-end skills built on the same knowledge graph.

Install & first run

Download the free Desktop app for Windows or macOS from the download page. A typical first session:

/harness init --template react-nextjs-frontend
/harness spec --advance "user dashboard with auth and i18n"
/harness implement
/harness phase advance

Harness Engineering — concept & pipeline

Harness Engineering gives AI-driven development a fixed order and quality standards verified by machine, not by optimistic self-report. Humans and AI travel the same rails.

The 10-phase pipeline

Each phase advances only when its gate passes; rollback steps back one phase at a time.

requirements → design → test-spec → implementation → code-review → integration-test → performance → security → deploy-plan → confirm ✅
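The advance and rollback rule can be sketched as a small state machine. This is an illustrative in-memory model only: the class name `Pipeline` and the `gate_passed` argument are hypothetical, and MetaCoder's real implementation may differ.

```python
PHASES = [
    "requirements", "design", "test-spec", "implementation", "code-review",
    "integration-test", "performance", "security", "deploy-plan", "confirm",
]


class Pipeline:
    """Hypothetical in-memory model of the 10-phase pipeline."""

    def __init__(self, phase: str = "requirements") -> None:
        self.index = PHASES.index(phase)

    @property
    def phase(self) -> str:
        return PHASES[self.index]

    def advance(self, gate_passed: bool) -> bool:
        """Move forward one phase only if the current gate passed."""
        if not gate_passed or self.index == len(PHASES) - 1:
            return False
        self.index += 1
        return True

    def rollback(self) -> bool:
        """Step back exactly one phase; the first phase cannot roll back."""
        if self.index == 0:
            return False
        self.index -= 1
        return True
```

A failed gate leaves the phase unchanged, which is the property the pipeline relies on.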

4 roles (auto-assumed per phase)

Role	Phases
`pm`	requirements / design / deploy-plan / confirm
`tester`	test-spec / integration-test
`developer`	implementation / performance
`reviewer`	code-review / security
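As a rough illustration, the role table can be inverted into a phase-to-role lookup. The names `ROLE_PHASES` and `default_role` are hypothetical and only mirror the table above.

```python
# Role -> phases, copied from the table above.
ROLE_PHASES = {
    "pm": ["requirements", "design", "deploy-plan", "confirm"],
    "tester": ["test-spec", "integration-test"],
    "developer": ["implementation", "performance"],
    "reviewer": ["code-review", "security"],
}

# Invert the table: phase -> default role.
DEFAULT_ROLE = {
    phase: role for role, phases in ROLE_PHASES.items() for phase in phases
}


def default_role(phase: str) -> str:
    """Return the role that is auto-assumed when entering a phase."""
    return DEFAULT_ROLE[phase]
```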

Machine-verified gates — a gate's checks are defined in harness/workflow/gates.yaml. Honest-fail: a test that never ran, returned 0 cases, or couldn't reach the browser stays red.
3 context layers — session / phase / on-demand, managed with /harness context.
Audit trail — phase-advance / rollback / gate-run / role-assume are appended as NDJSON; /harness ai-rate measures AI adoption from git history.
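To make the NDJSON audit trail concrete, here is a minimal sketch of an append-only log with one JSON object per line. The field names (`ts`, `event`) and the file location are assumptions for illustration; the actual audit schema is not documented on this page.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_audit(path: Path, event: str, **fields) -> dict:
    """Append one JSON object per line (NDJSON). Field names are assumed."""
    entry = {"ts": datetime.now(timezone.utc).isoformat(), "event": event, **fields}
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


def read_audit(path: Path) -> list[dict]:
    """Read the log back, one entry per non-empty line."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because entries are only ever appended, earlier records are never rewritten, which is what makes the format suitable for an audit trail.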

Gate contents are project-defined. The specific checks each gate runs come from your harness/workflow/gates.yaml. Lists in this doc reflect the standard template that /harness init generates.

spec-kit — design first, build exactly to the design

spec-kit is the key innovation: the AI first produces an industry-standard design (a project constitution plus a spec.md → plan.md → tasks.md chain), then implements strictly to it, no more and no less. The aim is to keep hallucinated, out-of-scope features out of the code.

spec-kit is active unless METACODER_HARNESS_SPEC_MODE is set to off. It powers /harness spec, auto-generates tasks.md on the design → test-spec advance, and promotes cross-cutting features (auth, i18n, AI assistant) to first-class tasks so nothing is buried.

/harness — command reference

18 subcommands in three groups. Flags and defaults below are taken from the source (src/skills/bundled/harness.ts and the _exec registry).

Setup & inspect

/harness init

/harness init [--template TYPE] [--force]

Purpose: Scaffold a harness/ discipline directory in the workspace. Prereq: none (warns if harness/ exists unless --force).

Effect: creates harness/{rules,skills,agents,workflow,context}/ plus workflow/phases.yaml and gates.yaml. With spec-kit on, also writes harness/constitution.md (3 articles) and harness/specs/.gitkeep.

Flag	Default	Meaning
`--template TYPE`	`minimal`	`minimal`, `java-spring-enterprise`, `node-fastapi`, `python-django`, `go-microservice`, `react-nextjs-frontend`, `cobol-modernization`
`--force`	`false`	Overwrite existing `harness/` files without confirmation

/harness spec

/harness spec [--feature <name>] [--inputs <paths>] [--advance] [--db <dir>] <brief>

Purpose: Kick off the requirements/design phase — make the AI author the design documents from declared inputs and write no product code (the symmetric counterpart to /harness implement). Status: implemented.

Prereq: current phase must be requirements or design (auto-initializes to requirements if phase state was never set). <brief> is required in requirements.

Effect: resolves the active feature (scaffolding one if needed), loads design inputs from harness/inputs.yaml plus --inputs, and writes docs/SPEC_PLAN.md (the authoring SOP). The AI then authors spec.md (and, in the design phase, plan.md and data-model.md). A self-check follows, looking for template placeholders, an empty cross-cutting section and unresolved [NEEDS CLARIFICATION] markers. With --advance and a clean self-check, the command runs the gate and continues to the next phase.

Flag	Default	Meaning
`--feature <name>`	derived from brief	Feature display name
`--inputs <p1,p2,…>`	`harness/inputs.yaml`	CSV of design-input paths appended to the inputs registry
`--advance`	off	After a clean self-check, run the gate and continue into the next phase
`--db <dir>`	empty	Design-docs directory. The live DB URL recorded as a spec assumption comes from `DATABASE_URL`, not this flag.
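One part of the self-check, finding unresolved [NEEDS CLARIFICATION] markers, might look like the sketch below. The function name is hypothetical, and the regex assumes a marker may carry a trailing note inside the brackets; the real check may match differently.

```python
import re

# Assumption: a marker may include a note, e.g. [NEEDS CLARIFICATION: which provider?]
MARKER = re.compile(r"\[NEEDS CLARIFICATION[^\]]*\]")


def unresolved_clarifications(spec_text: str) -> list[tuple[int, str]]:
    """Return (line number, marker text) for every unresolved marker."""
    hits = []
    for lineno, line in enumerate(spec_text.splitlines(), start=1):
        for match in MARKER.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

An empty result corresponds to a clean self-check for this one criterion; placeholders and the cross-cutting section would need their own checks.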

/harness validate

Purpose: Check harness/ layout integrity. Prereq: init done.

Effect: reports ok, errors (✗) and warnings (⚠). With spec-kit on, also shows a Constitution section and a cross-artifact consistency section. No flags.

/harness lint

Purpose: Run rules from harness/rules/*.md against your source. Prereq: init done.

Effect: returns a 0–100 score and a violations table (severity / file / line / rule / message). With spec-kit on, constitution-conformance and no-unresolved-clarification always run. No flags.

/harness status

Purpose: Show current harness state. Prereq: none (prints "not initialized" if absent).

Effect: detected template, required/optional files present/missing. With spec-kit on, also constitution version, active feature, unresolved-clarification count. Bare /harness defaults to status. No flags.

/harness phase init

/harness phase init [--phase <id>] [--feature <name>]

Purpose: Initialize the phase state machine. Prereq: init done.

Flag	Default	Meaning
`--phase <id>`	`requirements`	Phase to initialize at
`--feature <name>`	none	spec-kit only: scaffold `specs/<NNN-name>/{spec,plan,tasks}.md` and set the active feature

Pipeline control

/harness phase status

Purpose: Print current phase, role, rollback count and last gate result. Prereq: phase state initialized.

Effect: with spec-kit on, also shows the active feature, task progress (done/total %) and, in requirements, unresolved clarifications. No flags.

/harness phase advance

/harness phase advance [--reason "…"] [--no-interactive]

Purpose: Run the current phase's gate and, on pass, advance to the next phase with automatic role re-assume. The most-used command.

Effect: (spec-kit, requirements) resolves [NEEDS CLARIFICATION] interactively; runs the gate; advances + re-assumes the phase-default role; on design → test-spec auto-generates tasks.md; on entering implementation, prompts you to use /harness implement first.

Flag	Default	Meaning
`--reason "…"`	empty	Audit reason
`--no-interactive`	off	spec-kit only: in requirements, block instead of prompting when clarifications are unresolved

/harness phase rollback

/harness phase rollback [--reason "…"]

Purpose: Roll back one phase. Prereq: rollback-attempt limit not exceeded (else "Human approval required").

Effect: returns to the previous phase and increments the attempt counter. Flag: --reason (audit reason).

/harness gate

/harness gate <gateId>

Purpose: Run a single named gate from gates.yaml (verify only; no phase change). Prereq: <gateId> defined (omit to list gate ids).

Effect: returns a per-check pass/fail table. gate-integration-test is special: it runs the /systest pipeline in-session (inheriting the Claude-in-Chrome MCP) rather than as a programmatic check.

/harness assume

/harness assume <role> [--reason "…"]

Purpose: Manually set the active role (suppresses phase auto-assume). Role: pm | developer | reviewer | tester.

Effect: sets the manual role, writes the persona block to CLAUDE.md, appends a role-assume audit entry.

/harness role

Purpose: Show the active role, whether it is manually overridden, and the phase-default role. No flags.

Implement & advanced

/harness implement

Purpose: Implement the next not-done task from tasks.md, gated by test-plan.md acceptance criteria. Prereq: current phase must be implementation; an active feature with tasks.md must exist.

Effect: aggregates the next task(s) + acceptance criteria into docs/IMPLEMENT_PLAN.md; the AI implements with a mandatory per-task self-check (AC coverage + design re-read for omitted cross-cutting features), then marks tasks done. Mode is driven by permission mode: auto / bypassPermissions → continuous (implements all tasks in order); otherwise single-task then stop. No flags.
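The mode selection described above can be sketched as follows. The tasks.md layout is an assumption (markdown checkboxes such as `- [ ] T002 auth`), and the function names are hypothetical.

```python
import re

# Assumption: tasks.md uses markdown checkboxes, "- [ ]" open and "- [x]" done.
TASK = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def pending_tasks(tasks_md: str) -> list[str]:
    """Return not-done tasks in file order."""
    pending = []
    for line in tasks_md.splitlines():
        match = TASK.match(line)
        if match and match.group(1) == " ":
            pending.append(match.group(2).strip())
    return pending


def tasks_for_run(tasks_md: str, permission_mode: str) -> list[str]:
    """auto / bypassPermissions: all pending tasks; otherwise only the next one."""
    pending = pending_tasks(tasks_md)
    if permission_mode in {"auto", "bypassPermissions"}:
        return pending
    return pending[:1]
```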

/harness context

/harness context <show | refresh | query <text>>

Purpose: Manage the 3-layer context (session / phase / on-demand). Prereq: init done.

show — print the snapshot + a per-layer entries/tokens table (no persist).
refresh — rebuild the snapshot, write it to CLAUDE.md, append an audit entry. With spec-kit, injects the active feature's spec/plan/tasks into the phase layer.
query <text> — run the on-demand layer for the text without persisting.

/harness approve

/harness approve <gateId> [--reason "…"]

Purpose: Record human approval for a pending human-approval gate check. Effect: writes a YAML approval record (consumed on the next gate run) and an audit entry.

/harness drill

/harness drill [--skip-gates] [--from <phaseId>] [--to <phaseId>]

Purpose: Virtual dry-run of the whole pipeline without persisting state. Prereq: init done.

Flag	Default	Meaning
`--skip-gates`	`false`	Walk all phases, structure check only (skip gate execution)
`--from <phaseId>`	default phase	Start phase (inclusive)
`--to <phaseId>`	terminal phase	End phase (inclusive)

/harness ai-rate

/harness ai-rate [--since <days>]

Purpose: Measure AI adoption (% of code authored/accepted by AI) from git history. Prereq: a git repo.

Effect: detects AI commits (author/email matches claude|anthropic, or a Co-Authored-By: Claude trailer) and reports commit-level + line-level rates and a per-author table. Flag: --since <days> (default 30).
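The detection rule can be approximated as below. This is a sketch under assumptions: matching is treated as case-insensitive, commits are passed in as plain (author, email, message) tuples rather than read from git, and only the commit-level rate is computed.

```python
import re

# Author or email containing "claude" or "anthropic" (case-insensitivity assumed).
AI_IDENTITY = re.compile(r"claude|anthropic", re.IGNORECASE)
# A "Co-Authored-By: Claude" trailer on its own line in the commit message.
AI_TRAILER = re.compile(r"^Co-Authored-By:\s*Claude", re.IGNORECASE | re.MULTILINE)


def is_ai_commit(author: str, email: str, message: str) -> bool:
    """True if the commit matches either detection rule."""
    return bool(
        AI_IDENTITY.search(author)
        or AI_IDENTITY.search(email)
        or AI_TRAILER.search(message)
    )


def commit_level_rate(commits: list[tuple[str, str, str]]) -> float:
    """Percentage of commits classified as AI-authored."""
    if not commits:
        return 0.0
    ai_count = sum(1 for commit in commits if is_ai_commit(*commit))
    return 100.0 * ai_count / len(commits)
```

The line-level rate and per-author table would need diff statistics per commit, which this sketch leaves out.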

/harness ci-init

/harness ci-init [--target github|gitlab|both] [--force]

Purpose: Emit CI scaffold files (idempotent). Effect: github → .github/workflows/harness.yml; gitlab → .gitlab-ci.yml; both → both. Existing files skipped unless --force.

Flag	Default	Meaning
`--target`	`github`	`github` | `gitlab` | `both`
`--force`	`false`	Overwrite existing CI files
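The skip-unless-forced behaviour is what makes the command idempotent. A minimal planning sketch, with a hypothetical `plan_ci_files` helper that only decides which files would be written:

```python
from pathlib import Path

# Target -> files, as listed in the Effect line above.
CI_FILES = {
    "github": [".github/workflows/harness.yml"],
    "gitlab": [".gitlab-ci.yml"],
}
CI_FILES["both"] = CI_FILES["github"] + CI_FILES["gitlab"]


def plan_ci_files(root: Path, target: str = "github", force: bool = False):
    """Return (to_write, skipped); existing files are skipped unless force is set."""
    to_write, skipped = [], []
    for rel in CI_FILES[target]:
        if (root / rel).exists() and not force:
            skipped.append(rel)
        else:
            to_write.append(rel)
    return to_write, skipped
```

Running the plan twice against the same workspace without `force` yields an empty write list the second time, once the files exist.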

Automation tools

/modernize and /newproject share the same argument parser; /systest has its own; /env is prompt-driven (subcommands). All flags below are verified against the source.

/modernize

Modernize a legacy codebase — analyze legacy source (and screenshots) via a knowledge graph, decompose into PRP modules, then multi-agent TDD-develop, test and document a new web app, without touching the legacy tree.

/modernize --workspace <legacy> --database <conn> [--output <dir>] [--design-style <name>] [--backend-lang] [--frontend-lang] [--language ja|en|zh] [--team] [-v] [--env <name>]

Flag (long / short)	Default	Meaning
`--workspace` / `-w`	required	Legacy code path (read-only reference)
`--output` / `-o`	= workspace	Output dir for modernized code
`--database` / `--db`	unset	Database connection URL
`--design-docs` / `-d`	unset	Requirements/design docs directory
`--design-style`	unset	Brand design language for the generated UI — see Design styles (70+ presets, e.g. stripe, linear.app, notion)
`--backend-lang` / `--frontend-lang`	unset	Target backend / frontend framework
`--reference` / `-r`	unset	Reference project path
`--language` / `-l`	`ja`	Documentation language
`--team`	`false`	Enable experimental Agent Teams
`-v` / `--verify`	`false`	Pause at end of Phase 3 for review
`--env`	unset	Activate a named infra env (URLs auto-filled)

Phase flow

Phase 0 Graph Init (legacy + output graphs) → Phase 3 Module Decomposition (PRPs) → Phase 3.5 Contract Generation (openapi.yaml) → Phase 4 Multi-Agent TDD Development → Phase 4.5 Acceptance Cross-check → Phase 5 Automated Test → Phase 6 MODERNIZATION_REPORT.md

Two-graph architecture — a read-only legacy graph for analysis/decomposition, plus an output graph grown during Phase 4 and verified by gate queries.
Screenshot-driven UI mapping — up to 10 pre-selected legacy screenshots mapped 1:1 to modern pages.
Contract-first — generates openapi.yaml so backend and frontend cannot drift.
PRP + Completion Markers — grep-detectable, re-runnable evidence per module; tech-stack resolution works even with no package.json (e.g. COBOL).

Design styles (`--design-style`)

Pass --design-style <name> to make the generated frontend follow a ready-made brand design language. MetaCoder reads the design spec (colors, typography, spacing, component patterns) from the open-source awesome-design-md library and applies the matching DESIGN.md during Phases 3–4 — so the modern UI looks intentional, not generic.

70+ brand presets are available, for example: apple, stripe, linear.app, notion, vercel, claude, figma, airbnb, spotify, tesla, shopify, supabase, raycast, nike, uber.

Browse the full catalogue (each preset is a folder with a DESIGN.md): github.com/VoltAgent/awesome-design-md. Use the folder name as the --design-style value, e.g. --design-style stripe.

/newproject

Build a greenfield project from requirement documents — confirm decisions via a Socratic Q&A, decompose into PRPs, then multi-agent TDD-develop, test and document a full-stack web app.

/newproject --workspace <dir> --design-docs <reqs> [--database <conn>] [--reference <proj>] [--design-style] [--backend-lang] [--frontend-lang] [-v] [--env]

Uses the same shared parser as /modernize, so the full flag set is identical. Notable differences below.

Flag	Default	Meaning
`--workspace` / `-w`	required	Output workspace for the new code
`--design-docs` / `-d`	unset	Requirements docs (enumerated only, not regex-parsed)
`--reference` / `-r`	auto	Reference/PoC project; auto-detected from common folders if omitted
`--database` / `--db`	resolved	If omitted: session TiDB (`DATABASE_URL`) → else local SQLite
`-v` / `--verify`	`false`	Pause at end of Phase 3 (the Socratic phase is itself interactive)

Phase flow

Phase 0 Graph Init (empty) → Phase 1.5 Socratic Requirements Confirmation (7–15 Q) → Phase 3 Module Decomposition → Phase 3.5 OpenAPI 3.1 Contract → Phase 4 Code Generation → Phase 4.5 Acceptance Cross-check → Phase 5 Automated Test → Phase 6 Docs

Socratic Phase 1.5 — 7–15 mandatory questions (architecture / business / UI-UX / non-functional / MVP) lock language, auth, DB, deploy target and scope before coding.
DB auto-resolution — explicit --database > session TiDB Cloud > local SQLite.
Docs are not machine-parsed — enumerated only; structuring is the AI's job. Reference code is auto-detected and an env scaffold is generated.
Design styles — like /modernize, --design-style <name> applies one of 70+ brand presets from awesome-design-md (e.g. --design-style notion).

/systest

Automated system QA of a running app — builds a knowledge graph, generates two-layer test cases, then runs API + Chrome-MCP frontend E2E tests with an auto-fix loop and an evidence-backed self-audit.

/systest run --workspace <path> [--backend-url <url>] [--frontend-url <url>] [--database <conn>] [--env <name>] [--design-docs <path>]

Flag (long / aliases / short)	Default	Meaning
`--workspace` / `-w`	required	Project workspace to test
`--backend-url` / `--backend` / `-b`	unset / from env	Backend base URL → enables Phase 5B (API testing)
`--frontend-url` / `--frontend` / `-f`	unset / from env	Frontend URL → enables Phase 5C (E2E). Absent ⇒ 5C skipped
`--database` / `--database-url` / `--db`	unset / from env	Database URL (user creation, admin promotion)
`--env`	unset	Named env; fills missing URL slots (explicit flags win)
`--design-docs` / `-d`	unset	Design documents directory
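How explicit flags and a named env combine, and which test phases that enables, can be illustrated as follows. The slot names and both helpers are hypothetical; they only restate the table above.

```python
URL_SLOTS = ("backend_url", "frontend_url", "database")


def resolve_slots(flags: dict, env_urls: dict) -> dict:
    """Explicit flags win; the named env only fills slots left unset."""
    return {slot: flags.get(slot) or env_urls.get(slot) for slot in URL_SLOTS}


def enabled_phases(slots: dict) -> list[str]:
    """Phase 5B needs a backend URL; Phase 5C needs a frontend URL."""
    phases = []
    if slots["backend_url"]:
        phases.append("5B")
    if slots["frontend_url"]:
        phases.append("5C")
    return phases
```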

Phase flow

Step 0 Env Discovery (production ⇒ hard refuse) → Phase 1–3 Init / Graph → Phase 4 Test-case Generation (Layer1 CRUD / Layer2 scenario) → Phase 5A Seed + Quality Gate → Phase 5B Backend API → Phase 5C Frontend E2E (Chrome MCP) → Phase 6 TEST_REPORT.md → Phase 6.5 Evidence Self-audit

Two-layer testing — Layer 1 CRUD/happy-path then Layer 2 negative/auth/validation/boundary/role-routing. Running only one layer is a VIOLATION.
Chrome-MCP-driven E2E — real-browser navigation/form-submit; captures 2xx POSTs as evidence; the only Phase 5C skip trigger is a missing --frontend-url.
Honest-fail self-audit (6.5) — re-reads evidence files; any pass claim without matching evidence is downgraded to FAIL.
Production safety — an env tagged production is rejected before Phase 1.
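The honest-fail self-audit in Phase 6.5 amounts to re-checking every pass claim against evidence on disk. A minimal sketch, assuming one evidence file per test case named after the case id (that naming scheme and the `self_audit` function are illustrative assumptions):

```python
from pathlib import Path


def self_audit(claims: dict[str, str], evidence_dir: Path) -> dict[str, str]:
    """Downgrade any PASS claim that has no matching evidence file to FAIL."""
    audited = {}
    for case_id, status in claims.items():
        # Assumption: evidence files are named "<case id>.<ext>".
        has_evidence = any(evidence_dir.glob(f"{case_id}.*"))
        if status == "PASS" and not has_evidence:
            audited[case_id] = "FAIL"
        else:
            audited[case_id] = status
    return audited
```

The audit can only turn a pass into a fail, never the reverse, so it cannot make a report look better than its evidence.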

/env

Prompt-driven infrastructure-environment manager — list, switch, import/export, edit, health-check and diff environments defined in .metacoder/environments.yaml, with secret-safety and a preview → /apply guard.

Subcommand	Meaning
`/env list [--tag TAG]`	List envs + active (bare `/env` = list)
`/env use NAME [Reason: …]`	Switch active env
`/env reset`	Return to the default env
`/env import PATH --as NAME`	Import a `.env`; auto-classify keys var/secret
`/env export NAME [--include-secrets]`	Export as `.env`; secrets masked unless flag
`/env add NAME [extends PARENT] [field=VALUE …]`	Add env (requires cloud.provider, cloud.region, safety.destructive_ops)
`/env edit NAME [field=VALUE …]`	Shallow-merge field updates
`/env remove NAME`	Remove env (warns about dependents)
`/env clone SRC DST [field=VALUE …]`	Copy with overrides
`/env set NAME.PATH VALUE`	Dot-path single-field set
`/env doctor [--level basic|standard|full] [--env NAME] [--force] [--offline]`	Health check (full blocked on production unless --force)
`/env diff A B`	Side-by-side diff after extends resolution

Secret safety — list/doctor/diff never print secret values, only ${var:KEY} / ${secret:KEY} names. Import classifies keys containing PASSWORD/SECRET/TOKEN/CREDENTIAL/PASS/PWD/KEY as secret; NEXT_PUBLIC_/VITE_PUBLIC_/PUBLIC_ prefixes as var.
Preview → /apply — every mutating command shows a YAML diff and writes nothing until you type /apply (/cancel discards).
doctor 3 levels — basic (resolve refs), standard (+ TCP reachability), full (+ cloud-CLI identity; production-guarded). --offline skips network checks.
Cross-skill bridge — --env feeds backend/frontend/database URLs into /systest, /modernize, /newproject; production envs are rejected by consuming skills before running.
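The import classification rule from the secret-safety item can be sketched as below. Two points are assumptions rather than documented behaviour: public prefixes are checked before secret keywords (so `NEXT_PUBLIC_API_KEY` is a var), and keys that match neither rule default to var.

```python
SECRET_WORDS = ("PASSWORD", "SECRET", "TOKEN", "CREDENTIAL", "PASS", "PWD", "KEY")
PUBLIC_PREFIXES = ("NEXT_PUBLIC_", "VITE_PUBLIC_", "PUBLIC_")


def classify_key(key: str) -> str:
    """Classify an imported .env key as 'var' or 'secret'."""
    upper = key.upper()
    # Assumption: a public prefix takes precedence over secret keywords.
    if upper.startswith(PUBLIC_PREFIXES):
        return "var"
    if any(word in upper for word in SECRET_WORDS):
        return "secret"
    return "var"


def reference(key: str) -> str:
    """Render the name-only reference that is printed instead of the value."""
    return "${" + classify_key(key) + ":" + key + "}"
```

Substring matching is deliberately broad, so a key such as `BYPASS_CACHE` would also be treated as a secret under this sketch.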

Accuracy note: a 60-rule CLI allowlist, a SQL classifier and two-phase-commit belong to the separate /deploy skill, not /env. /env only surfaces safety.writes_two_phase_commit as a flagged field in /env diff.

Documentation reflects MetaCoder Desktop v3.6.22. Command syntax is verified against the source code; gate contents are defined per project in harness/workflow/gates.yaml.
