Devgraph

@bonsai/devgraph is a locally run TypeScript LangGraph orchestrator that takes a ticket, prompt, or PR review task through a watchable lifecycle DAG. Agent work happens in interactive panes, while deterministic steps such as git operations, GitHub calls, test gates, graph state, and artifact validation stay under the orchestrator.

The default profile, implementation, covers the full ticket-to-PR lifecycle. The built-in pr-review profile reviews an existing PR without checking out or modifying its branch.

What It Automates

Implementation runs can include:

  1. Fetch ticket or prompt input.
  2. Clarify ambiguous requirements.
  3. Define acceptance criteria.
  4. Plan work.
  5. Write red tests when the plan warrants them, classifying durable suite tests separately from temporary verification-only tests.
  6. Verify the red gate.
  7. Implement the change with Codex agent panes.
  8. Verify the green gate.
  9. Record and validate acceptance-criteria videos when configured.
  10. Run cross-model review.
  11. Apply review fixes.
  12. Push the branch and create a PR.
  13. Run the Greptile review loop until the target score or cycle cap is reached.

Every loop has a cap. Exhausting a cap writes BLOCKED.md and exits with a non-zero status instead of spinning.
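
The cap rule can be sketched as a generic bounded loop. This is a minimal illustration rather than Devgraph's actual implementation: runCappedLoop, LoopOutcome, and the attempt callback are hypothetical names. In this sketch the caller would be the one to write BLOCKED.md and exit with a non-zero status when the blocked outcome comes back.

```typescript
// Hypothetical sketch of a capped loop; none of these names come from Devgraph.
type LoopOutcome =
  | { status: "done"; cycles: number }
  | { status: "blocked"; cycles: number; reason: string };

// Runs `attempt` until it reports success or `maxCycles` is exhausted.
function runCappedLoop(
  name: string,
  maxCycles: number,
  attempt: (cycle: number) => boolean,
): LoopOutcome {
  for (let cycle = 1; cycle <= maxCycles; cycle++) {
    if (attempt(cycle)) {
      return { status: "done", cycles: cycle };
    }
  }
  // Cap exhausted: report a blocked outcome instead of retrying forever.
  return {
    status: "blocked",
    cycles: maxCycles,
    reason: `${name} did not converge within ${maxCycles} cycles`,
  };
}

// Example: a review loop whose (made-up) scores reach a target of 4 on cycle 3.
const scores = [2, 3, 4];
const outcome = runCappedLoop("review", 5, (cycle) => (scores[cycle - 1] ?? 0) >= 4);
```

Returning a value instead of calling process.exit inside the loop keeps the loop testable and leaves the blocked-file write and exit code to one place in the caller.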

Prerequisites

  • Node 20 or newer.
  • Bun, which the install, build, and CLI commands in this document use.
  • Git.
  • tmux for the default pane backend (brew install tmux).
  • Codex CLI installed and authenticated.
  • GitHub CLI installed and authenticated.
  • Atlassian CLI (acli) for JIRA-backed tickets, or direct JIRA REST env vars when using ticket.source: "jira".
  • Optional: Bonsai CLI for worktree isolation, Herdr for the experimental backend, VHS for terminal videos, and Greptile GitHub app for review scoring.

Install and Develop

cd apps/devgraph
bun install
bun run build
bun src/cli.ts doctor

Development checks:

bun test --timeout 120000
bun run lint
bun run typecheck
bun run build

Quickstart

Run the fake-agent demo with real graph traversal and artifact writes:

bun src/cli.ts demo

Check prerequisites:

bun src/cli.ts doctor

Run against a ticket from a target repository:

cd /path/to/target-repo
bun /path/to/bonsai/apps/devgraph/src/cli.ts run MB-123

Run from a free-text task:

cd /path/to/target-repo
bun /path/to/bonsai/apps/devgraph/src/cli.ts run "Add a dark-mode toggle to settings"

Review an existing PR without checking out or modifying its branch:

cd /path/to/target-repo
bun /path/to/bonsai/apps/devgraph/src/cli.ts run --task pr-review 123
bun /path/to/bonsai/apps/devgraph/src/cli.ts run --task pr-review https://github.com/org/repo/pull/123
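
The two pr-review invocations above take either a bare PR number or a full GitHub pull request URL. One way such an argument could be normalized is sketched below. parsePrRef and PrRef are hypothetical names, and Devgraph's real argument handling may differ.

```typescript
// Hypothetical helper; Devgraph's real argument parsing may differ.
interface PrRef {
  number: number;
  owner?: string; // set only when a full URL was given
  repo?: string; // set only when a full URL was given
}

function parsePrRef(input: string): PrRef | null {
  const trimmed = input.trim();
  // Bare PR number, e.g. "123".
  if (/^\d+$/.test(trimmed)) {
    return { number: Number(trimmed) };
  }
  // Full URL, e.g. "https://github.com/org/repo/pull/123".
  const match = /^https:\/\/github\.com\/([^/]+)\/([^/]+)\/pull\/(\d+)\/?$/.exec(trimmed);
  if (match) {
    return { owner: match[1], repo: match[2], number: Number(match[3]) };
  }
  // Anything else is not a PR reference in this sketch.
  return null;
}
```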

Commands

devgraph run <TICKET|"task">        Run a lifecycle for a ticket or free-text task.
devgraph runs                       List runs under .lifecycle/runs.
devgraph resume [run-dir|ticket]    Continue a paused or interrupted run.
devgraph answer [run-dir|ticket]    Answer a waiting clarify gate.
devgraph videos [run-dir|ticket]    Open or list recorded AC videos.
devgraph demo                       Execute the local fake-agent demo.
devgraph graph                      Print the compiled lifecycle DAG as Mermaid.
devgraph doctor                     Check prerequisites and config.

Common options:

--config <path>                Path to devgraph.config.json.
--task <profile>               Task profile, such as implementation or pr-review.
--backend <auto|tmux|herdr>    Multiplexer backend.
--detached                     Start the tmux conductor session without attaching.
--delay-ms <n>                 Demo stage delay.
--list                         List videos without opening them.

Configuration

Devgraph reads devgraph.config.json from the current working directory unless --config is supplied. Important sections include:

Section         Purpose
repo            Target repo path, base branch, test command, setup command, branch prefix, and workspace mode.
task            Default task profile and named profile overrides.
ticket          Ticket source: acli, jira, file, codex compatibility alias, or prompt.
codex           Codex binary, implementation model, review model, network mode, sandbox bypass, disabled MCP servers, and required MCP servers.
stages          Agent, implementation, red-test, review, and blocked-grace limits.
clarify         Clarify gate mode and timeout.
videos          Acceptance video enablement, strict mode, app URL, and app startup timeout.
greptile        Greptile enablement, bot name, target score, polling, cycle timeout, and manual trigger behavior.
mux             Pane backend: auto, tmux, or herdr.
signal          Completion notification settings.
checkpointer    SQLite or Postgres LangGraph checkpoint backend.

Minimal implementation config:

{
  "repo": {
    "path": ".",
    "testCommand": "bunx turbo run typecheck test build"
  },
  "codex": {
    "implementModel": "gpt-5.5",
    "reviewModel": "gpt-5.4"
  }
}
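
To make the shape of the minimal config concrete, here is a sketch of a typed parser covering only the four fields shown above. MinimalConfig and parseMinimalConfig are hypothetical names. Treating all four fields as required is an assumption of this sketch; Devgraph's own schema may well supply defaults.

```typescript
// Hypothetical types covering only the fields in the minimal config above.
interface MinimalConfig {
  repo: { path: string; testCommand: string };
  codex: { implementModel: string; reviewModel: string };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Assumption: all four fields are required; the real schema may supply defaults.
function parseMinimalConfig(json: string): MinimalConfig {
  const raw: unknown = JSON.parse(json);
  if (!isRecord(raw) || !isRecord(raw.repo) || !isRecord(raw.codex)) {
    throw new Error("config must contain repo and codex objects");
  }
  const { path, testCommand } = raw.repo;
  const { implementModel, reviewModel } = raw.codex;
  if (typeof path !== "string" || typeof testCommand !== "string") {
    throw new Error("repo.path and repo.testCommand must be strings");
  }
  if (typeof implementModel !== "string" || typeof reviewModel !== "string") {
    throw new Error("codex.implementModel and codex.reviewModel must be strings");
  }
  return { repo: { path, testCommand }, codex: { implementModel, reviewModel } };
}

const config = parseMinimalConfig(`{
  "repo": { "path": ".", "testCommand": "bunx turbo run typecheck test build" },
  "codex": { "implementModel": "gpt-5.5", "reviewModel": "gpt-5.4" }
}`);
```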

Run Artifacts

Each run writes to:

<repo>/.lifecycle/runs/<run-id>/
  status.json
  events.ndjson
  graph.mmd
  checkpoint.sqlite
  snapshots/
  diff.patch
  questions.md
  answers.md
  videos/
  GO.md
  BLOCKED.md

Stage directories also contain the prompt and result.json sentinel files used to coordinate interactive agents.
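
The .ndjson extension suggests that events.ndjson holds one JSON object per line. Under that assumption, a run's event log could be read with a small parser like the sketch below. parseNdjson is a hypothetical helper, and the stage and status fields in the sample are invented for illustration because the event schema is not documented here.

```typescript
// Hypothetical helper: parses newline-delimited JSON into an array of values.
// Events stay typed as `unknown` because their schema is not documented here.
function parseNdjson(text: string): unknown[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // skip blank lines and the trailing newline
    .map((line) => JSON.parse(line) as unknown);
}

// Sample input with made-up fields, standing in for the contents of events.ndjson.
const sample =
  '{"stage":"plan","status":"started"}\n' +
  '{"stage":"plan","status":"done"}\n';
const events = parseNdjson(sample);
```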

Human Intervention

The conductor shows live status for each stage. If clarification is required, answer in the conductor or from another terminal:

devgraph answer

If a run stops as blocked, inspect BLOCKED.md, fix the missing human action or external prerequisite, and resume:

devgraph resume

Safety Model

  • Agents write schema-validated result files instead of being scraped from the TUI.
  • The orchestrator owns git, GitHub, test execution, checkpointing, and artifact validation.
  • Red-test agents can write durable testFiles plus temporary verificationFiles; only durable tests are committed, and temporary verification files are removed before implementation or fix commits.
  • The review stage uses a different model and receives a patch file; the orchestrator verifies it did not edit the worktree.
  • Bonsai workspaces are auto-detected when configured, so ticket runs can happen in isolated worktrees.
  • Disabled MCP servers default to github, atlassian, and sentry; add a server to codex.requiredMcpServers to keep it enabled.
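
Two of the rules above reduce to simple set operations, sketched here under stated assumptions. Only testFiles, verificationFiles, and requiredMcpServers are names from this document; the function names are hypothetical, and treating a path listed in both arrays as durable is an assumption.

```typescript
// Hypothetical sketch; only testFiles, verificationFiles, and
// requiredMcpServers are names taken from the document.
interface RedTestResult {
  testFiles: string[];
  verificationFiles: string[];
}

// Splits red-test output into files to commit and files to delete.
// Assumption: a path listed in both arrays is treated as durable.
function planRedTestFiles(result: RedTestResult): { commit: string[]; remove: string[] } {
  const durable = new Set(result.testFiles);
  return {
    commit: Array.from(durable),
    remove: result.verificationFiles.filter((file) => !durable.has(file)),
  };
}

// Defaults as listed in the Safety Model section.
const DEFAULT_DISABLED_MCP = ["github", "atlassian", "sentry"];

// A server named in requiredMcpServers is dropped from the disabled list.
function disabledMcpServers(requiredMcpServers: string[]): string[] {
  const required = new Set(requiredMcpServers);
  return DEFAULT_DISABLED_MCP.filter((server) => !required.has(server));
}
```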

Task Profiles

Built-ins:

  • implementation or feature: full ticket or prompt implementation lifecycle.
  • pr-review or review-pr: existing PR review lifecycle.

Named profiles can extend built-ins and append stage-specific prompt guidance:

{
  "task": {
    "profiles": {
      "security-pr-review": {
        "extends": "pr-review",
        "features": { "clarify": true },
        "prompts": {
          "review": "Prioritize auth bypasses, leaked secrets, data exposure, and missing abuse-case tests."
        }
      }
    }
  }
}
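
One way to picture how a named profile resolves against its built-in is the sketch below. It is not Devgraph's resolver: resolveProfile, ProfileConfig, and ResolvedProfile are hypothetical names. Falling back to implementation when extends is omitted and the shallow merge of features are both assumptions. Prompt guidance is appended per stage, matching the description above.

```typescript
// Hypothetical resolver sketch; not Devgraph's actual implementation.
interface ProfileConfig {
  extends?: string;
  features?: Record<string, boolean>;
  prompts?: Record<string, string>;
}

interface ResolvedProfile {
  base: string; // built-in profile the extends chain ends at
  features: Record<string, boolean>;
  prompts: Record<string, string[]>; // appended guidance per stage, parent first
}

// Built-in names and aliases as listed under Task Profiles.
const ALIASES: Record<string, string> = { feature: "implementation", "review-pr": "pr-review" };
const BUILT_INS = new Set(["implementation", "pr-review"]);

function resolveProfile(
  name: string,
  profiles: Record<string, ProfileConfig>,
  seen: string[] = [],
): ResolvedProfile {
  const canonical = ALIASES[name] ?? name;
  if (BUILT_INS.has(canonical)) {
    return { base: canonical, features: {}, prompts: {} };
  }
  const profile = profiles[canonical];
  if (!profile) throw new Error(`Unknown task profile: ${name}`);
  if (seen.includes(canonical)) throw new Error(`Circular extends at: ${canonical}`);
  // Assumption: a profile without `extends` builds on "implementation".
  const parent = resolveProfile(profile.extends ?? "implementation", profiles, [...seen, canonical]);
  const prompts: Record<string, string[]> = { ...parent.prompts };
  const own = profile.prompts ?? {};
  for (const stage of Object.keys(own)) {
    const text = own[stage];
    if (text === undefined) continue;
    prompts[stage] = [...(prompts[stage] ?? []), text];
  }
  return { base: parent.base, features: { ...parent.features, ...profile.features }, prompts };
}
```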