MetaCoder Documentation
Command reference for MetaCoder's Harness Engineering pipeline (/harness) and the four automation tools. Syntax, flags and defaults on this page are verified against the MetaCoder source code.
/harness status.
Overview
MetaCoder is a graph-first AI development tool for large-scale systems. It turns a codebase into a Semantic Knowledge Graph and drives AI development through Harness Engineering — a disciplined 10-phase pipeline with machine-verified quality gates. Two command families:
- /harness: the discipline system, a 10-phase state machine with roles, gates, context layers and an audit trail. 18 subcommands.
- Automation tools: /modernize, /newproject, /systest, /env. End-to-end skills built on the same knowledge graph.
Install & first run
Download the free Desktop app for Windows or macOS from the download page. A typical first session:
Harness Engineering — concept & pipeline
Harness Engineering gives AI-driven development a fixed order and quality standards verified by machine, not by optimistic self-report. Humans and AI travel the same rails.
The 10-phase pipeline
Each phase advances only when its gate passes; rollback steps back one phase at a time.
4 roles (auto-assumed per phase)
| Role | Phases |
|---|---|
| pm | requirements / design / deploy-plan / confirm |
| tester | test-spec / integration-test |
| developer | implementation / performance |
| reviewer | code-review / security |
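
The role table can be read as a lookup from phase to default role. The following is a minimal Python sketch of that reading, not MetaCoder's data model: the names `ROLE_PHASES`, `PHASE_ROLE` and `default_role` are hypothetical and exist only for illustration.

```python
# Hypothetical mirror of the role table; not MetaCoder's actual data model.
ROLE_PHASES = {
    "pm": ["requirements", "design", "deploy-plan", "confirm"],
    "tester": ["test-spec", "integration-test"],
    "developer": ["implementation", "performance"],
    "reviewer": ["code-review", "security"],
}

# Invert the table so each phase maps to the role assumed for it.
PHASE_ROLE = {
    phase: role for role, phases in ROLE_PHASES.items() for phase in phases
}


def default_role(phase: str) -> str:
    """Return the role auto-assumed for a phase, per the table."""
    try:
        return PHASE_ROLE[phase]
    except KeyError:
        raise ValueError(f"unknown phase: {phase!r}") from None
```

The dictionary groups phases by role only; its ordering says nothing about the order of the pipeline.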
- Machine-verified gates: a gate's checks are defined in harness/workflow/gates.yaml. Honest-fail: a test that never ran, returned 0 cases, or couldn't reach the browser stays red.
- 3 context layers: session / phase / on-demand, managed with /harness context.
- Audit trail: phase-advance / rollback / gate-run / role-assume are appended as NDJSON; /harness ai-rate measures AI adoption from git history.
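
The honest-fail rule can be illustrated with a small Python sketch. It assumes a hypothetical `CheckResult` record; the real check schema lives in gates.yaml and is not described here, so treat the field names as placeholders.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Hypothetical shape of one gate check's outcome."""
    executed: bool                  # did the test run at all?
    cases_run: int                  # number of test cases that actually ran
    failures: int                   # number of failing cases
    browser_reachable: bool = True  # only meaningful for browser-based checks


def is_green(result: CheckResult) -> bool:
    """Honest-fail: anything short of real, passing evidence stays red."""
    if not result.executed:
        return False  # never ran
    if result.cases_run == 0:
        return False  # ran but exercised nothing
    if not result.browser_reachable:
        return False  # could not reach the browser
    return result.failures == 0
```

The point of the ordering is that "no failures" is only consulted after the run has been shown to have actually happened.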
Gate contents are defined per project in harness/workflow/gates.yaml. Lists in this doc reflect the standard template that /harness init generates.
spec-kit: design first, build exactly to the design
spec-kit is the key innovation: the AI first produces an industry-standard design (a project constitution plus a spec.md → plan.md → tasks.md chain), then implements strictly to it — no more, no less. This prevents AI hallucination from adding imagined, out-of-scope features. spec-kit is active when METACODER_HARNESS_SPEC_MODE is not off; it powers /harness spec, auto-generates tasks.md on the design → test-spec advance, and promotes cross-cutting features (auth, i18n, AI assistant) to first-class tasks so nothing is buried.
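
The activation rule ("active when METACODER_HARNESS_SPEC_MODE is not off") could be checked as in the sketch below. The helper name `spec_kit_active` is hypothetical, and the case-insensitive, whitespace-tolerant comparison is an assumption; the text above only says "not off".

```python
import os
from typing import Mapping, Optional


def spec_kit_active(env: Optional[Mapping[str, str]] = None) -> bool:
    """spec-kit is on unless METACODER_HARNESS_SPEC_MODE is 'off'.

    Assumption: the comparison ignores case and surrounding whitespace.
    """
    env = os.environ if env is None else env
    value = env.get("METACODER_HARNESS_SPEC_MODE", "")
    return value.strip().lower() != "off"
```

Under this reading an unset variable leaves spec-kit on, which matches "not off".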
/harness — command reference
18 subcommands in three groups. Flags and defaults below are taken from the source (src/skills/bundled/harness.ts and the _exec registry).
/harness init
Effect: creates harness/{rules,skills,agents,workflow,context}/ plus workflow/phases.yaml and gates.yaml. With spec-kit on, also writes harness/constitution.md (3 articles) and harness/specs/.gitkeep.
| Flag | Default | Meaning |
|---|---|---|
| --template TYPE | minimal | minimal, java-spring-enterprise, node-fastapi, python-django, go-microservice, react-nextjs-frontend, cobol-modernization |
| --force | false | Overwrite existing harness/ files without confirmation |
/harness spec
Prereq: current phase must be requirements or design (auto-initializes to requirements if phase state was never set). <brief> is required in requirements.
Effect: resolves the active feature (scaffolding one if needed), loads design inputs from harness/inputs.yaml plus --inputs, and writes docs/SPEC_PLAN.md (the authoring SOP). The AI then authors spec.md (and, in design, plan.md + data-model.md) and runs a self-check for template placeholders, an empty cross-cutting section, and unresolved [NEEDS CLARIFICATION] markers. With --advance and a clean self-check, it runs the gate and continues to the next phase.
| Flag | Default | Meaning |
|---|---|---|
| --feature <name> | derived from brief | Feature display name |
| --inputs <p1,p2,…> | harness/inputs.yaml | CSV of design-input paths appended to the inputs registry |
| --advance | off | After a clean self-check, run the gate and continue into the next phase |
| --db <dir> | empty | Design-docs directory. The live DB URL recorded as a spec assumption comes from DATABASE_URL, not this flag. |
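
One part of the self-check, finding unresolved [NEEDS CLARIFICATION] markers, is easy to picture as a scan over spec.md. This is a rough sketch only: `unresolved_clarifications` is a hypothetical name, and accepting a marker with a trailing question after a colon is an assumption not confirmed by this page.

```python
import re

# Matches "[NEEDS CLARIFICATION]" and, as an assumption, a variant that
# carries a question after a colon: "[NEEDS CLARIFICATION: ...]".
MARKER = re.compile(r"\[NEEDS CLARIFICATION(?::[^\]]*)?\]")


def unresolved_clarifications(spec_text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each line holding a marker."""
    hits = []
    for number, line in enumerate(spec_text.splitlines(), start=1):
        if MARKER.search(line):
            hits.append((number, line.strip()))
    return hits
```

A clean self-check in this sketch simply means the returned list is empty; the placeholder and cross-cutting checks are not modelled.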
/harness validate
Effect: reports ok, errors (✗) and warnings (⚠). With spec-kit on, also shows a Constitution section and a cross-artifact consistency section. No flags.
/harness lint
Effect: returns a 0–100 score and a violations table (severity / file / line / rule / message). With spec-kit on, constitution-conformance and no-unresolved-clarification always run. No flags.
/harness status
Effect: detected template, required/optional files present/missing. With spec-kit on, also constitution version, active feature, unresolved-clarification count. Bare /harness defaults to status. No flags.
/harness phase init
| Flag | Default | Meaning |
|---|---|---|
| --phase <id> | requirements | Phase to initialize at |
| --feature <name> | none | spec-kit only: scaffold specs/<NNN-name>/{spec,plan,tasks}.md and set the active feature |
/harness phase status
Effect: with spec-kit on, also shows the active feature, task progress (done/total %) and, in requirements, unresolved clarifications. No flags.
/harness phase advance
Effect: (spec-kit, requirements) resolves [NEEDS CLARIFICATION] interactively; runs the gate; advances + re-assumes the phase-default role; on design → test-spec auto-generates tasks.md; on entering implementation, prompts you to use /harness implement first.
| Flag | Default | Meaning |
|---|---|---|
| --reason "…" | empty | Audit reason |
| --no-interactive | off | spec-kit only: in requirements, block instead of prompting when clarifications are unresolved |
/harness phase rollback
Effect: returns to the previous phase and increments the attempt counter. Flag: --reason (audit reason).
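
A rollback that steps back one phase, bumps the attempt counter and produces an NDJSON audit line might look like the sketch below. Everything about the shapes is assumed: the `state` record, the audit field names, and the caller-supplied `phases` list are hypothetical, and whether the counter belongs to the phase or the whole run is not stated here.

```python
import json
from datetime import datetime, timezone


def rollback(state: dict, phases: list[str], reason: str = "") -> tuple[dict, str]:
    """Step back exactly one phase and build one NDJSON audit line.

    `state` is a hypothetical {"phase": str, "attempt": int} record and
    `phases` is the ordered phase list supplied by the caller.
    """
    index = phases.index(state["phase"])
    if index == 0:
        raise ValueError("already at the first phase; nothing to roll back to")
    new_state = {"phase": phases[index - 1], "attempt": state["attempt"] + 1}
    entry = {
        "event": "rollback",
        "from": state["phase"],
        "to": new_state["phase"],
        "attempt": new_state["attempt"],
        "reason": reason,
        "at": datetime.now(timezone.utc).isoformat(),
    }
    # NDJSON: one compact JSON object per line, newline-terminated.
    return new_state, json.dumps(entry, ensure_ascii=False) + "\n"
```

Appending the returned line to a log file keeps the trail append-only, which is the property NDJSON is usually chosen for.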
/harness gate
Effect: returns a per-check pass/fail table. gate-integration-test is special: it runs the /systest pipeline in-session (inheriting the Claude-in-Chrome MCP) rather than as a programmatic check.
/harness assume
Effect: sets the manual role, writes the persona block to CLAUDE.md, appends a role-assume audit entry.
/harness role
/harness implement
Effect: aggregates the next task(s) + acceptance criteria into docs/IMPLEMENT_PLAN.md; the AI implements with a mandatory per-task self-check (AC coverage + design re-read for omitted cross-cutting features), then marks tasks done. Mode is driven by permission mode: auto / bypassPermissions → continuous (implements all tasks in order); otherwise single-task then stop. No flags.
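
The mode selection described above reduces to a two-way branch on the permission mode. A small sketch follows, with hypothetical helper names (`implement_mode`, `tasks_for_run`); only the two mode strings `auto` and `bypassPermissions` come from the text.

```python
CONTINUOUS_MODES = {"auto", "bypassPermissions"}


def implement_mode(permission_mode: str) -> str:
    """Map a permission mode to the /harness implement run mode."""
    return "continuous" if permission_mode in CONTINUOUS_MODES else "single-task"


def tasks_for_run(pending: list[str], permission_mode: str) -> list[str]:
    """Continuous takes every pending task in order; otherwise only the next one."""
    if implement_mode(permission_mode) == "continuous":
        return list(pending)
    return pending[:1]
```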
/harness context
- show: print the snapshot + a per-layer entries/tokens table (no persist).
- refresh: rebuild the snapshot, write it to CLAUDE.md, append an audit entry. With spec-kit, injects the active feature's spec/plan/tasks into the phase layer.
- query <text>: run the on-demand layer for the text without persisting.
/harness approve
/harness drill
| Flag | Default | Meaning |
|---|---|---|
| --skip-gates | false | Walk all phases, structure check only (skip gate execution) |
| --from <phaseId> | default phase | Start phase (inclusive) |
| --to <phaseId> | terminal phase | End phase (inclusive) |
/harness ai-rate
Effect: detects AI commits (author/email matches claude|anthropic, or a Co-Authored-By: Claude trailer) and reports commit-level + line-level rates and a per-author table. Flag: --since <days> (default 30).
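
The detection rule can be approximated in a few lines of Python. This is a sketch under assumptions: matching is taken to be case-insensitive, the trailer is matched at the start of a message line, and the commit dictionaries are a hypothetical input shape rather than anything /harness ai-rate exposes. Line-level rates are not modelled.

```python
import re

AI_IDENTITY = re.compile(r"claude|anthropic", re.IGNORECASE)
# Assumption: the trailer is matched per line and case-insensitively.
AI_TRAILER = re.compile(r"^Co-Authored-By:\s*Claude", re.IGNORECASE | re.MULTILINE)


def is_ai_commit(author: str, email: str, message: str) -> bool:
    """True when the author, the email, or a trailer marks the commit as AI."""
    return bool(
        AI_IDENTITY.search(author)
        or AI_IDENTITY.search(email)
        or AI_TRAILER.search(message)
    )


def commit_level_rate(commits: list[dict]) -> float:
    """Share of commits flagged as AI, as a percentage (0.0 for no commits)."""
    if not commits:
        return 0.0
    flagged = sum(
        is_ai_commit(c["author"], c["email"], c["message"]) for c in commits
    )
    return 100.0 * flagged / len(commits)
```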
/harness ci-init
| Flag | Default | Meaning |
|---|---|---|
| --target | github | github \| gitlab \| both |
| --force | false | Overwrite existing CI files |
Automation tools
/modernize and /newproject share the same argument parser; /systest has its own; /env is prompt-driven (subcommands). All flags below are verified against the source.
/modernize
Modernize a legacy codebase — analyze legacy source (and screenshots) via a knowledge graph, decompose into PRP modules, then multi-agent TDD-develop, test and document a new web app, without touching the legacy tree.
| Flag (long / short) | Default | Meaning |
|---|---|---|
| --workspace / -w | required | Legacy code path (read-only reference) |
| --output / -o | = workspace | Output dir for modernized code |
| --database / --db | unset | Database connection URL |
| --design-docs / -d | unset | Requirements/design docs directory |
| --design-style | unset | Brand design language for the generated UI; see Design styles (70+ presets, e.g. stripe, linear.app, notion) |
| --backend-lang / --frontend-lang | unset | Target backend / frontend framework |
| --reference / -r | unset | Reference project path |
| --language / -l | ja | Documentation language |
| --team | false | Enable experimental Agent Teams |
| -v / --verify | false | Pause at end of Phase 3 for review |
| --env | unset | Activate a named infra env (URLs auto-filled) |
Phase flow
- Two-graph architecture: a read-only legacy graph for analysis/decomposition, plus an output graph grown during Phase 4 and verified by gate queries.
- Screenshot-driven UI mapping: up to 10 pre-selected legacy screenshots mapped 1:1 to modern pages.
- Contract-first: generates openapi.yaml so backend and frontend cannot drift.
- PRP + Completion Markers: grep-detectable, re-runnable evidence per module; tech-stack resolution works even with no package.json (e.g. COBOL).
Design styles (--design-style)
Pass --design-style <name> to make the generated frontend follow a ready-made brand design language. MetaCoder reads the design spec (colors, typography, spacing, component patterns) from the open-source awesome-design-md library and applies the matching DESIGN.md during Phases 3–4 — so the modern UI looks intentional, not generic.
70+ brand presets are available, for example: apple, stripe, linear.app, notion, vercel, claude, figma, airbnb, spotify, tesla, shopify, supabase, raycast, nike, uber.
Preset library: github.com/VoltAgent/awesome-design-md. Use the folder name as the --design-style value, e.g. --design-style stripe.
/newproject
Build a greenfield project from requirement documents — confirm decisions via a Socratic Q&A, decompose into PRPs, then multi-agent TDD-develop, test and document a full-stack web app.
| Flag | Default | Meaning |
|---|---|---|
| --workspace / -w | required | Output workspace for the new code |
| --design-docs / -d | unset | Requirements docs (enumerated only, not regex-parsed) |
| --reference / -r | auto | Reference/PoC project; auto-detected from common folders if omitted |
| --database / --db | resolved | If omitted: session TiDB (DATABASE_URL) → else local SQLite |
| -v / --verify | false | Pause at end of Phase 3 (the Socratic phase is itself interactive) |
Phase flow
- Socratic Phase 1.5: 7–15 mandatory questions (architecture / business / UI-UX / non-functional / MVP) lock language, auth, DB, deploy target and scope before coding.
- DB auto-resolution: explicit --database > session TiDB Cloud > local SQLite.
- Docs are not machine-parsed: enumerated only; structuring is the AI's job. Reference code is auto-detected and an env scaffold is generated.
- Design styles: like /modernize, --design-style <name> applies one of 70+ brand presets from awesome-design-md (e.g. --design-style notion).
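
The database precedence for /newproject (explicit flag, then the session DATABASE_URL, then local SQLite) can be sketched as a simple fallback chain. The function name `resolve_database` and the SQLite URL are hypothetical; the docs only say "local SQLite" without giving a path.

```python
from typing import Mapping, Optional

# Assumption: the fallback URL is illustrative; the docs only say "local SQLite".
SQLITE_FALLBACK = "sqlite:///./local.db"


def resolve_database(
    flag_value: Optional[str], env: Mapping[str, str]
) -> tuple[str, str]:
    """Return (url, source) using: explicit flag > session DATABASE_URL > SQLite."""
    if flag_value:
        return flag_value, "flag"
    session_url = env.get("DATABASE_URL")
    if session_url:
        return session_url, "session"
    return SQLITE_FALLBACK, "sqlite"
```

Returning the source alongside the URL makes it easy to report which rule fired.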
/systest
Automated system QA of a running app — builds a knowledge graph, generates two-layer test cases, then runs API + Chrome-MCP frontend E2E tests with an auto-fix loop and an evidence-backed self-audit.
| Flag (long / aliases / short) | Default | Meaning |
|---|---|---|
| --workspace / -w | required | Project workspace to test |
| --backend-url / --backend / -b | unset / from env | Backend base URL → enables Phase 5B (API testing) |
| --frontend-url / --frontend / -f | unset / from env | Frontend URL → enables Phase 5C (E2E). Absent ⇒ 5C skipped |
| --database / --database-url / --db | unset / from env | Database URL (user creation, admin promotion) |
| --env | unset | Named env; fills missing URL slots (explicit flags win) |
| --design-docs / -d | unset | Design documents directory |
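
How --env interacts with the explicit URL flags can be sketched as a per-slot fallback. The slot names and both helper functions are hypothetical; only the rules (explicit flags win, 5B needs a backend URL, 5C is skipped without a frontend URL) come from the table.

```python
URL_SLOTS = ("backend_url", "frontend_url", "database_url")


def fill_url_slots(flags: dict, env_values: dict) -> dict:
    """Explicit flags win; the named env fills only the slots left unset."""
    resolved = {}
    for slot in URL_SLOTS:
        explicit = flags.get(slot)
        resolved[slot] = explicit if explicit else env_values.get(slot)
    return resolved


def phases_enabled(resolved: dict) -> dict:
    """5B needs a backend URL; 5C is skipped when no frontend URL is present."""
    return {
        "5B": bool(resolved["backend_url"]),
        "5C": bool(resolved["frontend_url"]),
    }
```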
Phase flow
- Two-layer testing: Layer 1 CRUD/happy-path, then Layer 2 negative/auth/validation/boundary/role-routing. Running only one layer is a VIOLATION.
- Chrome-MCP-driven E2E: real-browser navigation/form-submit; captures 2xx POSTs as evidence; the only Phase 5C skip trigger is a missing --frontend-url.
- Honest-fail self-audit (6.5): re-reads evidence files; any pass claim without matching evidence is downgraded to FAIL.
- Production safety: an env tagged production is rejected before Phase 1.
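
The self-audit idea, re-reading evidence and downgrading unsupported pass claims, might be sketched as follows. The evidence layout is an assumption made purely for illustration (one `<case-id>.json` file per test case), as is the `self_audit` name; the real file format is not described on this page.

```python
from pathlib import Path


def self_audit(claims: dict[str, str], evidence_dir: Path) -> dict[str, str]:
    """Downgrade PASS to FAIL when no non-empty evidence file backs the claim.

    Assumption: evidence for case "TC-01" lives at <evidence_dir>/TC-01.json.
    """
    audited = {}
    for case_id, verdict in claims.items():
        evidence = evidence_dir / f"{case_id}.json"
        has_evidence = evidence.is_file() and evidence.stat().st_size > 0
        if verdict == "PASS" and not has_evidence:
            audited[case_id] = "FAIL"
        else:
            audited[case_id] = verdict
    return audited
```

The audit only ever downgrades: a FAIL claim is never promoted, whatever evidence exists.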
/env
Prompt-driven infrastructure-environment manager — list, switch, import/export, edit, health-check and diff environments defined in .metacoder/environments.yaml, with secret-safety and a preview → /apply guard.
| Subcommand | Meaning |
|---|---|
| /env list [--tag TAG] | List envs + active (bare /env = list) |
| /env use NAME [Reason: …] | Switch active env |
| /env reset | Return to the default env |
| /env import PATH --as NAME | Import a .env; auto-classify keys var/secret |
| /env export NAME [--include-secrets] | Export as .env; secrets masked unless flag |
| /env add NAME [extends PARENT] [field=VALUE …] | Add env (requires cloud.provider, cloud.region, safety.destructive_ops) |
| /env edit NAME [field=VALUE …] | Shallow-merge field updates |
| /env remove NAME | Remove env (warns about dependents) |
| /env clone SRC DST [field=VALUE …] | Copy with overrides |
| /env set NAME.PATH VALUE | Dot-path single-field set |
| /env doctor [--level basic\|standard\|full] [--env NAME] [--force] [--offline] | Health check (full blocked on production unless --force) |
| /env diff A B | Side-by-side diff after extends resolution |
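
Two of the mutations in the table, the dot-path set and the shallow merge, differ in how deep they reach. The sketch below contrasts them on plain dictionaries; the helper names and the sample values are hypothetical, and the actual YAML handling in /env is not shown on this page.

```python
import copy


def set_by_dot_path(config: dict, dot_path: str, value) -> dict:
    """Return a copy of `config` with one nested field set via a dot path."""
    updated = copy.deepcopy(config)
    node = updated
    *parents, leaf = dot_path.split(".")
    for key in parents:
        # Create intermediate mappings as needed while walking down.
        node = node.setdefault(key, {})
    node[leaf] = value
    return updated


def shallow_merge(env: dict, updates: dict) -> dict:
    """Top-level keys in `updates` replace those in `env`; no deep merging."""
    return {**env, **updates}
```

Under this reading a shallow merge replaces a nested mapping wholesale, so a single nested field is safer to change with the dot-path form.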
- Secret safety: list/doctor/diff never print secret values, only ${var:KEY} / ${secret:KEY} names. Import classifies keys containing PASSWORD/SECRET/TOKEN/CREDENTIAL/PASS/PWD/KEY as secret; NEXT_PUBLIC_/VITE_PUBLIC_/PUBLIC_ prefixes as var.
- Preview → /apply: every mutating command shows a YAML diff and writes nothing until you type /apply (/cancel discards).
- doctor 3 levels: basic (resolve refs), standard (+ TCP reachability), full (+ cloud-CLI identity; production-guarded). --offline skips network checks.
- Cross-skill bridge: --env feeds backend/frontend/database URLs into /systest, /modernize, /newproject; production envs are rejected by consuming skills before running.
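
The import-time key classification can be sketched as below. Three points are assumptions, since the text does not settle them: matching is case-insensitive, a public prefix takes precedence over a secret marker (so `NEXT_PUBLIC_API_KEY` is a var), and unmatched keys default to var. The function names are hypothetical.

```python
SECRET_MARKERS = ("PASSWORD", "SECRET", "TOKEN", "CREDENTIAL", "PASS", "PWD", "KEY")
PUBLIC_PREFIXES = ("NEXT_PUBLIC_", "VITE_PUBLIC_", "PUBLIC_")


def classify_key(key: str) -> str:
    """Classify a .env key as 'var' or 'secret'.

    Assumptions: matching is case-insensitive, a public prefix takes
    precedence over a secret marker, and unmatched keys default to 'var'.
    """
    upper = key.upper()
    if upper.startswith(PUBLIC_PREFIXES):
        return "var"
    if any(marker in upper for marker in SECRET_MARKERS):
        return "secret"
    return "var"


def masked_reference(key: str) -> str:
    """Render the name-only reference form, never the value itself."""
    return f"${{{classify_key(key)}:{key}}}"
```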
/deploy skill, not /env. /env only surfaces safety.writes_two_phase_commit as a flagged field in /env diff.
Documentation reflects MetaCoder Desktop v3.6.22. Command syntax is verified against the source code; gate contents are defined per project in harness/workflow/gates.yaml.