← Back to agents

AGENTS.md from marcusglee11/deal-ai

1 starsLast commit Jun 4, 2025

Deal‑AI — Agents Specification

This file formalises the autonomous/LLM “agents” that the Deal‑AI stack will expose once we switch to a Codex‑style multi‑agent runner.

> **Why this file?** Codex and similar orchestrators expect an `agents.md` inventory so they can bootstrap tool‑use policies, call‑routes, and permissions without scraping ad‑hoc comments from source code.

---

1 Glossary

| Term | Meaning | | -------------- | ------------------------------------------------------------------------------------- | | **Tool** | An executable capability (e.g. “Google Drive Search”, “Chroma Query”, “Jinja Render”) | | **Agent** | A named persona/policy that can invoke one or more tools to fulfil a sub‑task | | **Controller** | Top‑level router (provided by Codex) that maps user intent → agent chain |

---

2 Current Agents (MVP)

| Name | Description / Role | Tools authorised | Triggers | | ------------------- | --------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | -------------------------------- | | **drive\_ingestor** | Lists files in a Google Drive folder, downloads bytes. | `drive.list`, `drive.download` | `process-deal` HTTP POST | | **doc\_parser** | Converts raw bytes → `ParsedDocument` objects (PDF → text, XLSX → tables, DOCX → text). | `unstructured.parse_pdf`, `pandas.read_excel`, `unstructured.parse_docx` | downstream from `drive_ingestor` | | **validator** | Runs Pydantic validation and Great Expectations checks (TBD) on parsed data. | `pydantic.validate`, `great_expectations.check` | after `doc_parser` | | **db\_writer** | Persists deal JSON to Postgres. | `postgres.insert_jsonb` | after `validator` | | **embedder** | Splits text, calls OpenAI embeddings, upserts to Chroma. | `openai.embed`, `chroma.upsert` | after `db_writer` | | **rag\_qa** | Retrieves top‑k chunks from Chroma and calls Chat Completion. | `chroma.query`, `openai.chat` | `/chat` endpoint | | **report\_builder** | Loads deal JSON, renders lender‑overview via Jinja template. | `jinja.render`, `postgres.select_jsonb` | `/report/{deal_id}` endpoint |

> **Note** – At MVP we run all agents sequentially inside one FastAPI worker; Codex upgrade will allow parallel/task‑queue execution.

---

3 Tool Catalogue

3.1 drive.list

```yaml input : {folder_id: str} output: List[ {id:str, name:str, mimeType:str} ] ```

Lists non‑trashed, non‑archived files in a Google Drive folder.

3.2 drive.download

```yaml input : {file_id: str} output: bytes # raw file contents ```

3.3 unstructured.parse\_pdf

Converts PDF bytes → plain text using Unstructured partitioner.

3.4 pandas.read\_excel

Reads XLSX → DataFrame list.

3.5 unstructured.parse\_docx

Reads DOCX/DOCM → plain text.

3.6 pydantic.validate

Uses `schemas.ParsedDocument` to raise on missing fields.

3.7 postgres.insert\_jsonb / select\_jsonb

Simple SQLAlchemy helpers.

3.8 openai.embed

`model: text-embedding-ada-002`, returns 1536‑d vector.

3.9 chroma.upsert / query

HTTP insert + nearest‑neighbour search.

3.10 openai.chat

`model: gpt-3.5-turbo` (default) – deterministic (`temperature=0`).

3.11 jinja.render

Renders `backend/templates/overview.md.jinja` with deal context.

---

4 Routing Logic (Codex pseudocode)

```mermaid flowchart TD A[process-deal] --> B(drive_ingestor) B --> C(doc_parser) C --> D(validator) D --> E(db_writer) E --> F(embedder) chat_request --> R(rag_qa) report_request --> S(report_builder) ```

---

5 Permissions Matrix

| Agent → Tool | list | download | parse | validate | db | embed | query | render | | ------------------- | ---- | -------- | ----- | -------- | -- | ----- | ----- | ------ | | **drive\_ingestor** | ✔ | ✔ | – | – | – | – | – | – | | **doc\_parser** | – | – | ✔ | – | – | – | – | – | | **validator** | – | – | – | ✔ | – | – | – | – | | **db\_writer** | – | – | – | – | ✔ | – | – | – | | **embedder** | – | – | – | – | – | ✔ | – | – | | **rag\_qa** | – | – | – | – | – | – | ✔ | – | | **report\_builder** | – | – | – | – | ✔ | – | – | ✔ |

---

6 Future Agents (post‑MVP)

* **excel\_cashflow\_agent** – classify and normalise cash‑flow tables. * **term\_sheet\_agent** – draft term‑sheet JSON → legal markup. * **anomaly\_detector** – compare new deal metrics vs historical DB.

---

7 How to Extend

1. Define a new tool in this file. 2. Add a Python function in `backend/tools/`. 3. Add an agent entry with authorised tools. 4. Update routing in `controllers.py` once Codex orchestrator is integrated.

---

> *Generated 4 Jun 2025 via ChatGPT.*