Agents Specification
Overview
Open‑Deep‑Coder is an agentic IDE implementing a graph‑of‑agents for intelligent software development. It combines the research pattern (plan → act → observe → critique → iterate) with advanced LLM integration, development tools, and a sophisticated user interface.
Core Capabilities
LLM Integration
- **Remote LLMs** via OpenRouter for cloud-based models (GPT-4, Claude, etc.)
- **Local LLMs** via Ollama for privacy and offline development
- **RouteLLM** for intelligent request routing based on task type:
- Code generation/debugging → Code-specialized models
- Architecture/planning → Reasoning-focused models
- Documentation/testing → Language-oriented models
- Security analysis → Security-trained models
Development Environment
- **LSP Integration** - Full language server protocol support
- **MCP Servers** - Model Context Protocol for enhanced AI capabilities
- **n8n Workflows** - Automation for CI/CD and development processes
- **Chat-based Configuration** - Natural language setup for integrations
- **Permission System** - Secure management of remote connections
User Interface
- **Four Theme Variants**:
- Light Low Contrast - Gentle on eyes for extended use
- Light High Contrast - Enhanced readability with bright syntax highlighting
- Dark Low Contrast - Comfortable dark mode
- Dark High Contrast - Maximum contrast for accessibility
- **Customizable Keybinds** - JSON-configurable keyboard shortcuts
- **Integrated Chat** - Direct LLM interaction within the IDE
Roles
1) Orchestrator
* Maintains state machine and task graph. * Selects next action based on test/lint outcomes. * Produces/updates artifacts: `plan.md`, `patchset.diff`, `test_report.json`, `pr_body.md`.
2) Planner
* Reads repo + backlog, proposes milestone goals and atomic tasks. * Generates/updates `plan.md` with clear acceptance criteria and owner agent. * Splits tasks by domain: feature, refactor, bugfix, infra/CI, security.
**Planner Output Template**
```yaml cycle: <n> milestone: <name> tasks:
title: <short imperative> rationale: <why> steps: [s1, s2, s3] acceptance:
```
- id: OD-<id>
- tests: <which tests pass>
- quality: <lint/type/coverage thresholds>
3) Implementer
* Edits/creates code and tests. * Uses `shell.run` for installs/builds. * Writes minimal diffs aligned with acceptance criteria.
**Implementer Output Template**
```yaml change: files:
action: modify|create|delete commands: ["pip install -e .", "pytest -q"] notes: <design decisions or tradeoffs> ```
- path: src/...
4) Verifier
* Runs tests (`tests.run`), linters (`lint.run`), type checks (e.g., `mypy`), and coverage. * Produces `test_report.json` and a short diagnosis (e.g., flaky test, unmet import, perf regression).
**Verifier JSON (example)**
```json { "tests": {"passed": 31, "failed": 0, "skipped": 1}, "coverage": 78.2, "lint": {"errors": 0, "warnings": 2}, "typecheck": {"errors": 0}, "artifacts": ["coverage.xml", "pytest.log"], "next_actions": ["Ready for review"] } ```
5) Reviewer
* Reviews `patchset.diff`, `test_report.json`, and security scan results. * Enforces checklist; drafts `pr_body.md` and, with approval, opens PR.
**Security Checklist (minimum)**
* No secrets committed; dependency pins present. * Input validation added/retained; logging avoids sensitive data. * Licenses respected; third‑party code noted in `NOTICE`.
6) Researcher (optional)
* Looks up library/API usage and summarizes sources inside `pr_body.md`. * May propose alternatives or deprecations.
Tools (expected interfaces)
> Implement as MCP tools, LangChain tools, or process‑local adapters.
Core Development Tools
* **File System** * `fs.read(path) → {content}` * `fs.write(path, content) → {ok}` * `fs.glob(pattern) → {paths[]}` * **Shell** * `shell.run(cmd, timeout?) → {exit_code, stdout, stderr}` * **Git** * `git.branch(name)`, `git.diff()`, `git.add(paths)`, `git.commit(msg)`, `git.push()` * **Tests/Lint/Format** * `tests.run(target) → report_json` * `lint.run(target)`, `format.run(target)` * **Security** * `secr.scan(target) → findings[]`
LLM Integration Tools
* **OpenRouter Client** * `openrouter.chat(model, messages, context?) → response` * `openrouter.models() → available_models[]` * **Ollama Client** * `ollama.chat(model, messages, context?) → response` * `ollama.models() → local_models[]` * **RouteLLM Router** * `route.classify(request) → task_type` * `route.select_model(task_type, context?) → optimal_model`
Development Environment Tools
* **LSP Manager** * `lsp.start(language, workspace) → server_instance` * `lsp.completion(file, position) → suggestions[]` * `lsp.diagnostics(file) → errors[]` * **MCP Handler** * `mcp.connect(server_url) → connection` * `mcp.invoke(tool, params) → result` * **n8n Connector** * `n8n.create_workflow(definition) → workflow_id` * `n8n.execute(workflow_id, data) → result`
UI/UX Tools
* **Theme Manager** * `theme.set(variant) → {ok}` // light-low, light-high, dark-low, dark-high * `theme.get_current() → current_theme` * **Keybind Manager** * `keybind.load(config_path) → bindings` * `keybind.save(bindings, config_path) → {ok}` * **Permission Gateway** * `permission.request(resource, reason) → granted` * `permission.check(resource) → has_access`
HTTP & External Access
* **HTTP** (with permission control) * `http.get(url, headers?) → {status, text}` // requires permission * `http.post(url, data, headers?) → {status, response}` // requires permission
Repo Conventions
* Python default (switchable to JS/TS by setting `LANG=js` in `plan.md`). * `pyproject.toml` with `ruff`, `black`, `pytest`, `coverage`, `mypy`. * `.github/workflows/ci.yml` runs: install → lint → typecheck → tests → coverage upload.
CI Skeleton
```yaml name: ci on: [push, pull_request] jobs: build: runs-on: ubuntu-latest steps:
with: { python-version: "3.11" }
```
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
- run: pip install -U pip
- run: pip install -e .[dev]
- run: ruff check .
- run: black --check .
- run: mypy src
- run: pytest -q --maxfail=1 --disable-warnings --cov=src --cov-report=xml
`README.md` Seed (have the Implementer generate on init)
```
Open‑Deep‑Coder
A multi‑agent coding workflow (Planner → Implementer → Verifier → Reviewer) running on a LangGraph/MCP‑style toolbelt.
Quickstart
make init && make test ```
Starting Backlog (copy into `plan.md` on init)
* OD‑1 Initialize repo + CI * OD‑2 Implement tool adapters (fs, shell, git, tests, lint, secr) * OD‑3 Add example module + tests * OD‑4 Wire PR body generator * OD‑5 Add security baseline * OD‑6 Explore OpenHands and OpenDevin as inspiration or tool sources * OD‑7 Extend LangGraph for dynamic agent spawning and parallel Implementers
External Projects & Extensions
* **OpenHands**: provides concrete examples of multi‑modal tools (edit code, run shell, interact with browser). Open‑Deep‑Coder can reuse its shell+fs abstractions or adapt its action protocol as a compatibility layer. * **OpenDevin**: showcases autonomous software‑engineering flows, including SWE‑Bench benchmarks and persistent workspaces. Useful for evaluation harness and workspace persistence ideas. * **LangGraph**: serves as the backbone orchestrator. Extend with custom nodes (e.g., for spawning multiple Implementers in parallel) and richer state transitions (e.g., retry on flaky tests, branch merging heuristics).
Guardrails
* No network writes (publishing, package uploads) without explicit approval. * Max diff 500 LOC unless override in `plan.md`. * Always add/adjust tests for new behavior.
Human Approval Gates
* Creating remote branches / pushing. * Opening PRs. * Changing licenses or adding new dependencies.