
# Agent Orchestra

A multi-agent system where Claude-powered agents collaborate via Slack, each running in its own isolated Docker container with role-based permissions and tools.

---

## Project Overview

**Goal:** Build a server that orchestrates multiple asynchronous Claude agents (built on the Claude Agent SDK) that communicate through Slack, manage projects together, and delegate work to each other.

**Key Concepts:**
- Agents defined via YAML (roles, tools, permissions, model)
- Slack as the communication layer (threads, channels, mentions)
- Each agent runs in its own persistent Docker container
- Shared git repo with branch-based access control
- Simple JSON-based project/task management
- CEO agent as supervisor/orchestrator

---

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        Host Machine                              │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    Orchestrator Service                      ││
│  │  - Slack event listener (Socket Mode)                       ││
│  │  - Agent lifecycle management                               ││
│  │  - Message routing                                          ││
│  │  - MCP server exposure                                      ││
│  └─────────────────────────────────────────────────────────────┘│
│                              │                                   │
│         ┌────────────────────┼────────────────────┐             │
│         ▼                    ▼                    ▼             │
│  ┌─────────────┐     ┌─────────────┐      ┌─────────────┐       │
│  │  CEO Agent  │     │  PM Agent   │      │  Dev Agent  │       │
│  │  Container  │     │  Container  │      │  Container  │       │
│  │             │     │             │      │             │       │
│  │ - memory/   │     │ - memory/   │      │ - memory/   │       │
│  │ - tools     │     │ - tools     │      │ - tools     │       │
│  │ - git clone │     │ - git clone │      │ - git clone │       │
│  └─────────────┘     └─────────────┘      └─────────────┘       │
│         │                    │                    │             │
│         └────────────────────┼────────────────────┘             │
│                              ▼                                   │
│                    ┌─────────────────┐                          │
│                    │  Shared Volume  │                          │
│                    │  - /repos/      │                          │
│                    │  - /projects/   │                          │
│                    │  - /memory/     │                          │
│                    └─────────────────┘                          │
└─────────────────────────────────────────────────────────────────┘
```

---

## Directory Structure

```
agent-orchestra/
├── CLAUDE.md                    # Claude Code instructions
├── README.md
├── pyproject.toml
├── docker-compose.yml
├── Dockerfile.agent             # Base image for agent containers
├── Dockerfile.orchestrator      # Image for the orchestrator service
│
├── config/
│   ├── orchestra.yml            # Global config (Slack tokens, git repo, etc.)
│   └── agents/
│       ├── ceo.yml
│       ├── product_manager.yml
│       ├── developer.yml
│       ├── tech_lead.yml
│       └── designer.yml
│
├── src/
│   └── orchestra/
│       ├── __init__.py
│       ├── main.py              # Entry point
│       │
│       ├── core/
│       │   ├── __init__.py
│       │   ├── orchestrator.py  # Main orchestrator service
│       │   ├── agent.py         # Agent class (wraps Claude SDK)
│       │   ├── config.py        # YAML config loading/validation
│       │   └── container.py     # Docker container management
│       │
│       ├── slack/
│       │   ├── __init__.py
│       │   ├── client.py        # Slack API wrapper
│       │   ├── events.py        # Event handlers
│       │   └── formatter.py     # Message formatting
│       │
│       ├── tools/
│       │   ├── __init__.py
│       │   ├── registry.py      # Tool registration & permissions
│       │   ├── filesystem.py    # File operations
│       │   ├── git.py           # Git operations
│       │   ├── slack.py         # Slack tools (create_channel, etc.)
│       │   ├── tasks.py         # Project management
│       │   ├── web_search.py    # Web search
│       │   └── code_exec.py     # Code execution
│       │
│       ├── mcp/
│       │   ├── __init__.py
│       │   ├── server.py        # MCP server implementation
│       │   └── client.py        # MCP client for external servers
│       │
│       └── memory/
│           ├── __init__.py
│           └── manager.py       # Memory file management
│
├── data/                        # Mounted as shared volume
│   ├── repos/                   # Shared git repositories
│   ├── projects/                # Project JSON files
│   │   └── boards.json
│   └── memory/                  # Agent memory folders
│       ├── ceo/
│       │   └── memory.md
│       ├── product_manager/
│       │   └── memory.md
│       └── developer/
│           └── memory.md
│
└── tests/
    ├── __init__.py
    ├── test_agent.py
    ├── test_tools.py
    └── test_slack.py
```

---

## Configuration Schema

### Global Config (`config/orchestra.yml`)

```yaml
slack:
  app_token: "${SLACK_APP_TOKEN}"
  bot_token: "${SLACK_BOT_TOKEN}"
  default_channel: "general"

git:
  provider: "gitea"
  base_url: "${GITEA_URL}"          # e.g., https://git.example.com
  api_token: "${GITEA_API_TOKEN}"
  repo_owner: "${GITEA_REPO_OWNER}"
  repo_name: "${GITEA_REPO_NAME}"
  main_branch: "main"
  dev_branch: "dev"

docker:
  network: "orchestra-net"
  base_image: "agent-orchestra:latest"

mcp:
  expose_port: 8080
  external_servers:
    - name: "filesystem"
      url: "stdio://mcp-filesystem"
```
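The `${VAR}` placeholders in the config above imply that environment variables are substituted when the file is loaded. A minimal sketch of that step, assuming the YAML has already been parsed into plain dicts and lists; the function name `expand_env` and the fail-fast behavior on missing variables are illustrative choices, not part of the plan:

```python
from __future__ import annotations

import os
import re
from typing import Any, Mapping

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Mapping[str, str] | None = None) -> Any:
    """Recursively replace ${VAR} placeholders in strings, dicts, and lists."""
    env = os.environ if env is None else env

    def substitute(match: re.Match[str]) -> str:
        name = match.group(1)
        if name not in env:
            # Fail at startup rather than sending an empty token to Slack later.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    if isinstance(value, str):
        return _PLACEHOLDER.sub(substitute, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # ints, bools, and None pass through unchanged
```

Running the expansion before Pydantic validation would keep secrets out of the YAML files while still letting validation see the final values.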

### CEO Agent Config (`config/agents/ceo.yml`)

```yaml
id: ceo
name: "CEO Chris"
slack_user_id: "${SLACK_CEO_USER_ID}"

model: "claude-sonnet-4-20250514"
max_tokens: 8192

system_prompt: |
  You are CEO Chris, the supervisor of the agent team.
  You have broad oversight and can step in when needed.

  Your responsibilities:
  - Monitor team progress and intervene if blocked
  - Make high-level decisions when there's disagreement
  - Ensure quality standards are maintained
  - Approve major architectural decisions
  - Merge approved changes to main branch

  You generally let the team operate autonomously but
  provide guidance when asked or when you notice issues.

  You have access to all tools and can perform any operation.

tools:
  filesystem:
    enabled: true
    allowed_paths:
      - "/**"

  git:
    enabled: true
    allowed_operations:
      - "*"  # All operations
    denied_operations: []

  slack:
    enabled: true
    allowed_operations:
      - "*"

  tasks:
    enabled: true
    allowed_operations:
      - "*"

  code_exec:
    enabled: true
    allowed_languages:
      - python
      - javascript
      - bash
    timeout_seconds: 600

  web_search:
    enabled: true

memory:
  path: "/memory/ceo"
  files:
    - "memory.md"

can_mention:
  - "*"  # Can mention anyone

can_delegate_to:
  - "*"  # Can delegate to anyone

container:
  resources:
    memory: "2g"
    cpus: "1.0"
```

### Tech Lead Config (`config/agents/tech_lead.yml`)

```yaml
id: tech_lead
name: "Tech Lead Terry"
slack_user_id: "${SLACK_TECHLEAD_USER_ID}"

model: "claude-sonnet-4-20250514"
max_tokens: 8192

system_prompt: |
  You are Tech Lead Terry, responsible for code quality and architecture.

  Your responsibilities:
  - Review PRs from developers
  - Make architectural decisions
  - Ensure code quality and test coverage
  - Mentor developers on best practices
  - Approve PRs for merge to dev

  When reviewing code:
  1. Check for correctness and edge cases
  2. Verify tests exist and are meaningful
  3. Look for security issues
  4. Ensure code follows project conventions
  5. Provide constructive feedback

tools:
  filesystem:
    enabled: true
    allowed_paths:
      - "/repos/**"
      - "/workspace/**"

  git:
    enabled: true
    allowed_operations:
      - clone
      - checkout
      - pull
      - status
      - diff
      - log
      - list_prs
      - get_pr
      - add_pr_review
      - add_pr_comment
    denied_operations:
      - push_to_main
      - merge_pr  # PM merges to dev; CEO merges to main
      - force_push

  slack:
    enabled: true
    allowed_operations:
      - send_message
      - reply_thread
      - add_reaction
      - upload_file

  tasks:
    enabled: true
    allowed_operations:
      - view_tasks
      - update_task_status
      - add_comment
      - create_task  # Can create tech debt tasks

  code_exec:
    enabled: true
    allowed_languages:
      - python
      - javascript
      - bash
    timeout_seconds: 300

  web_search:
    enabled: true

memory:
  path: "/memory/tech_lead"
  files:
    - "memory.md"

can_mention:
  - developer
  - product_manager
  - ceo
```

### Developer Config (`config/agents/developer.yml`)

```yaml
id: developer
name: "Dev Dana"
slack_user_id: "${SLACK_DEV_USER_ID}"  # Bot user for this agent

model: "claude-sonnet-4-20250514"
max_tokens: 8192

system_prompt: |
  You are Dev Dana, a skilled software developer working in a team.
  You write clean, well-tested code. You work on feature branches
  and create PRs for review. You communicate progress in Slack.

  When assigned a task, you:
  1. Understand requirements fully (ask PM if unclear)
  2. Create a feature branch from dev
  3. Implement the solution with tests
  4. Create a PR and notify tech_lead for review

  You have access to the shared repository at /repos/main.

tools:
  filesystem:
    enabled: true
    allowed_paths:
      - "/repos/**"
      - "/workspace/**"
    denied_paths:
      - "/repos/.git/config"
      - "**/.env"

  git:
    enabled: true
    allowed_operations:
      - clone
      - checkout
      - branch
      - add
      - commit
      - push
      - pull
      - status
      - diff
      - create_pr
      - list_prs
    denied_operations:
      - push_to_main
      - push_to_dev
      - merge_pr
      - force_push
    branch_pattern: "feature/*"

  slack:
    enabled: true
    allowed_operations:
      - send_message
      - reply_thread
      - add_reaction
      - upload_file
    denied_operations:
      - create_channel
      - delete_channel

  tasks:
    enabled: true
    allowed_operations:
      - view_tasks
      - update_task_status
      - add_comment
    denied_operations:
      - create_project
      - create_board
      - delete_task

  code_exec:
    enabled: true
    allowed_languages:
      - python
      - javascript
      - bash
    timeout_seconds: 300

  web_search:
    enabled: true

memory:
  path: "/memory/developer"
  files:
    - "memory.md"

can_mention:
  - tech_lead
  - product_manager
  - ceo

container:
  resources:
    memory: "2g"
    cpus: "1.0"
  environment:
    - "PYTHONPATH=/workspace"
```
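The `allowed_operations`, `denied_operations`, and `branch_pattern` fields above suggest a check along the following lines. This is a sketch under stated assumptions: denies take precedence over allows, anything not explicitly allowed is rejected, and patterns use shell-style globs. `ToolPolicy` and both function names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ToolPolicy:
    """Mirrors one `tools.<name>` block from an agent YAML file."""

    enabled: bool = False
    allowed_operations: tuple[str, ...] = ()
    denied_operations: tuple[str, ...] = ()
    branch_pattern: str | None = None


def is_operation_allowed(policy: ToolPolicy, operation: str) -> bool:
    """Deny wins over allow; operations that match no allow entry are rejected."""
    if not policy.enabled:
        return False
    if any(fnmatchcase(operation, pattern) for pattern in policy.denied_operations):
        return False
    return any(fnmatchcase(operation, pattern) for pattern in policy.allowed_operations)


def is_branch_allowed(policy: ToolPolicy, branch: str) -> bool:
    """A policy without branch_pattern places no restriction on branch names."""
    return policy.branch_pattern is None or fnmatchcase(branch, policy.branch_pattern)
```

With the developer config, `push` would pass and `force_push` would not. Translating a plain `push` into `push_to_main` or `push_to_dev` based on the target branch is left to the caller in this sketch.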

### Product Manager Config (`config/agents/product_manager.yml`)

```yaml
id: product_manager
name: "PM Paula"
slack_user_id: "${SLACK_PM_USER_ID}"

model: "claude-sonnet-4-20250514"
max_tokens: 8192

system_prompt: |
  You are PM Paula, a product manager coordinating the team.
  You translate requirements into actionable tasks, create project
  channels, and ensure smooth delivery.

  When given a new feature request:
  1. Break it down into tasks
  2. Create a project channel if needed
  3. Assign tasks to appropriate team members
  4. Track progress and remove blockers

  You can merge approved PRs to the dev branch.

tools:
  filesystem:
    enabled: true
    allowed_paths:
      - "/repos/**"
      - "/workspace/**"
      - "/projects/**"

  git:
    enabled: true
    allowed_operations:
      - clone
      - checkout
      - pull
      - merge_to_dev
      - status
      - log
      - list_prs
      - get_pr
      - merge_pr
      - add_pr_review
    denied_operations:
      - push_to_main
      - force_push

  slack:
    enabled: true
    allowed_operations:
      - send_message
      - reply_thread
      - create_channel
      - invite_to_channel
      - upload_file
      - add_reaction

  tasks:
    enabled: true
    allowed_operations:
      - create_project
      - create_board
      - create_task
      - assign_task
      - update_task_status
      - view_tasks
      - add_comment

  web_search:
    enabled: true

memory:
  path: "/memory/product_manager"
  files:
    - "memory.md"

can_mention:
  - developer
  - tech_lead
  - designer
  - ceo

can_delegate_to:
  - developer
  - designer
```
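The `can_mention` and `can_delegate_to` lists could be enforced with a small helper like the one below. It assumes that `"*"` means "anyone" (as in the CEO config) and that an omitted list means "nobody"; the second assumption is not stated in the plan, and the name `can_target` is hypothetical.

```python
from typing import Any


def can_target(agent_config: dict[str, Any], field: str, target_id: str) -> bool:
    """Check `can_mention` or `can_delegate_to` for one target agent.

    A missing list is treated as empty, so agents without a
    `can_delegate_to` entry cannot delegate at all (an assumption).
    """
    allowed = agent_config.get(field, [])
    return "*" in allowed or target_id in allowed
```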

---

## Core Components

### 1. Orchestrator (`src/orchestra/core/orchestrator.py`)

```python
"""
Main orchestrator that:
- Loads agent configs
- Manages agent container lifecycle
- Routes Slack messages to appropriate agents
- Handles agent-to-agent communication
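"""

from __future__ import annotations  # defer evaluation of the type names below

# SlackEvent, AgentContainer, and Agent are project types defined elsewhere.


class Orchestrator: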
    async def start(self) -> None: ...
    async def stop(self) -> None: ...
    async def route_message(self, event: SlackEvent) -> None: ...
    async def spawn_agent(self, agent_id: str) -> AgentContainer: ...
    async def get_agent(self, agent_id: str) -> Agent: ...
```

### 2. Agent (`src/orchestra/core/agent.py`)

```python
"""
Wraps Claude Agent SDK with:
- Tool permission enforcement
- Memory management
- Slack integration
- Container-aware execution
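"""

from __future__ import annotations  # defer evaluation of the type names below

from typing import Any


class Agent:
    id: str
    config: AgentConfig
    claude_agent: ClaudeSDKClient  # From claude_agent_sdk (verify the class name against the installed SDK)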
    container: AgentContainer
    memory: MemoryManager

    async def process_message(self, message: str, context: dict) -> str: ...
    async def invoke_tool(self, tool: str, params: dict) -> Any: ...
    async def update_memory(self, content: str) -> None: ...
```

### 3. Container Manager (`src/orchestra/core/container.py`)

```python
"""
Docker container lifecycle management:
- Build agent images
- Start/stop containers
- Volume mounting
- Network configuration
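"""

from __future__ import annotations  # defer evaluation of the type names below

# AgentConfig and Container are project types defined elsewhere.


class ContainerManager: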
    async def create_container(self, agent_config: AgentConfig) -> Container: ...
    async def start_container(self, container_id: str) -> None: ...
    async def stop_container(self, container_id: str) -> None: ...
    async def exec_in_container(self, container_id: str, cmd: str) -> str: ...
```
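As one way `create_container` might translate an agent config into Docker settings, the sketch below builds keyword arguments for the Docker SDK's `containers.run`. The helper name `build_run_kwargs`, the `agent-{id}` container name, and the `/data` mount point are assumptions drawn from the compose file and configs in this plan.

```python
from typing import Any


def build_run_kwargs(
    agent: dict[str, Any], docker_cfg: dict[str, str], data_volume: str
) -> dict[str, Any]:
    """Map an agent config and the global `docker` section to run() arguments."""
    container_cfg = agent.get("container", {})
    resources = container_cfg.get("resources", {})

    # "KEY=value" strings from the YAML become a dict, plus the agent's own id.
    env = dict(item.split("=", 1) for item in container_cfg.get("environment", []))
    env["AGENT_ID"] = agent["id"]

    kwargs: dict[str, Any] = {
        "image": docker_cfg["base_image"],
        "name": f"agent-{agent['id']}",
        "detach": True,
        "network": docker_cfg["network"],
        "environment": env,
        "volumes": {data_volume: {"bind": "/data", "mode": "rw"}},
    }
    if "memory" in resources:
        kwargs["mem_limit"] = resources["memory"]  # e.g. "2g"
    if "cpus" in resources:
        # The Docker SDK expresses CPU limits in billionths of a CPU.
        kwargs["nano_cpus"] = int(float(resources["cpus"]) * 1_000_000_000)
    return kwargs


# Intended use (requires the `docker` package and a running daemon):
#   import docker
#   docker.from_env().containers.run(**build_run_kwargs(agent, cfg, "orchestra-data"))
```

Keeping the mapping in a pure function makes it testable without a Docker daemon.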

---

## Tool Implementations

### Tasks Tool (`src/orchestra/tools/tasks.py`)

JSON-based project management. All projects, boards, and tasks live in a single file, `/projects/boards.json`, with the following shape:

```json
{
  "projects": {
    "project-123": {
      "id": "project-123",
      "name": "Company Website",
      "channel": "proj-website",
      "created_by": "product_manager",
      "created_at": "2025-01-15T10:00:00Z"
    }
  },
  "boards": {
    "board-456": {
      "id": "board-456",
      "project_id": "project-123",
      "name": "Sprint 1",
      "columns": ["todo", "in_progress", "review", "done"]
    }
  },
  "tasks": {
    "task-789": {
      "id": "task-789",
      "board_id": "board-456",
      "title": "Implement homepage",
      "description": "Create responsive homepage with hero section",
      "status": "in_progress",
      "assignee": "developer",
      "created_by": "product_manager",
      "branch": "feature/task-789",
      "pr_url": null,
      "comments": [
        {
          "author": "developer",
          "text": "Started working on this",
          "timestamp": "2025-01-15T11:00:00Z"
        }
      ]
    }
  }
}
```
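Operations such as `update_task_status` could work directly on this structure. The sketch below validates the new status against the board's `columns` and records the change as a comment; both function names and the auto-generated comment text are assumptions. The save helper writes to a temporary file and renames it, which avoids half-written files but does not prevent two agents from overwriting each other's updates, so a lock would still be needed around read-modify-write cycles.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def update_task_status(boards: dict, task_id: str, status: str, author: str) -> dict:
    """Move a task to another column, validating against its board's columns."""
    task = boards["tasks"][task_id]
    columns = boards["boards"][task["board_id"]]["columns"]
    if status not in columns:
        raise ValueError(f"{status!r} is not a column of board {task['board_id']}")
    previous = task["status"]
    task["status"] = status
    task.setdefault("comments", []).append(
        {
            "author": author,
            "text": f"Status changed: {previous} -> {status}",
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
    )
    return task


def save_boards(path: Path, boards: dict) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(boards, handle, indent=2)
    os.replace(tmp_name, path)
```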

### Git Tool (`src/orchestra/tools/git.py`)

```python
"""
Git operations with permission checking.
Enforces branch patterns and operation restrictions.
Uses Gitea API for PR operations.
"""

class GitTool:
    # Local git operations
    async def checkout_branch(self, branch: str) -> str: ...
    async def create_branch(self, branch: str, from_ref: str = "dev") -> str: ...
    async def commit(self, message: str, files: list[str]) -> str: ...
    async def push(self, branch: str) -> str: ...
    async def pull(self, branch: str) -> str: ...
    async def status(self) -> str: ...
    async def diff(self, ref: str = "HEAD") -> str: ...

    # Gitea API operations
    async def create_pr(self, title: str, body: str, head: str, base: str = "dev") -> dict: ...
    async def list_prs(self, state: str = "open") -> list[dict]: ...
    async def get_pr(self, pr_number: int) -> dict: ...
    async def add_pr_comment(self, pr_number: int, body: str) -> dict: ...
    async def add_pr_review(self, pr_number: int, body: str, approved: bool) -> dict: ...
    async def merge_pr(self, pr_number: int, merge_style: str = "merge") -> dict: ...  # Permission checked


class GiteaClient:
    """
    Async client for Gitea API.
    Docs: https://docs.gitea.com/development/api-usage
    """

    def __init__(self, base_url: str, token: str, owner: str, repo: str):
        self.base_url = base_url.rstrip("/")
        self.token = token
        self.owner = owner
        self.repo = repo

    async def create_pull_request(
        self, title: str, body: str, head: str, base: str
    ) -> dict:
        """POST /repos/{owner}/{repo}/pulls"""
        ...

    async def get_pull_request(self, number: int) -> dict:
        """GET /repos/{owner}/{repo}/pulls/{index}"""
        ...

    async def merge_pull_request(
        self, number: int, merge_style: str = "merge"  # merge, rebase, squash
    ) -> dict:
        """POST /repos/{owner}/{repo}/pulls/{index}/merge"""
        ...

    async def create_review(
        self, number: int, body: str, event: str  # APPROVED, REQUEST_CHANGES, COMMENT
    ) -> dict:
        """POST /repos/{owner}/{repo}/pulls/{index}/reviews"""
        ...
```
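To make the `create_pull_request` stub more concrete, the sketch below builds the HTTP request without sending it. It assumes Gitea's REST API is served under `/api/v1` and accepts an `Authorization: token ...` header; the `GiteaRequest` container and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class GiteaRequest:
    """Everything an HTTP client needs to perform one API call."""

    method: str
    url: str
    headers: dict[str, str]
    json: dict[str, Any]


def build_create_pr_request(
    base_url: str,
    token: str,
    owner: str,
    repo: str,
    title: str,
    body: str,
    head: str,
    base: str = "dev",
) -> GiteaRequest:
    """Describe POST /repos/{owner}/{repo}/pulls for the given branches."""
    return GiteaRequest(
        method="POST",
        url=f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/pulls",
        headers={"Authorization": f"token {token}", "Accept": "application/json"},
        json={"title": title, "body": body, "head": head, "base": base},
    )


# Intended use with httpx (already listed in the dependencies):
#   async with httpx.AsyncClient() as client:
#       resp = await client.request(req.method, req.url, headers=req.headers, json=req.json)
#       resp.raise_for_status()
```

Separating request construction from transport keeps the client easy to unit-test and lets the permission check run before any network call is made.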

---

## Slack Integration

### Event Flow

```
Slack Event (message/mention)
         │
         ▼
┌─────────────────┐
│  Event Handler  │  ← Parses mentions, extracts thread context
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    Router       │  ← Determines which agent(s) should respond
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Agent Queue    │  ← Each agent has async message queue
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Agent Process  │  ← Claude SDK processes, may invoke tools
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Slack Response │  ← Posts reply in thread, may mention others
└─────────────────┘
```
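The handler, router, and queue stages of this flow can be sketched as follows. The sketch assumes Slack's `<@USERID>` mention syntax in message text and one `asyncio.Queue` per agent; the `Router` class, `thread_key`, and the user-ID mapping are illustrative names, not the plan's final API.

```python
import asyncio
import re

# Slack encodes mentions as <@U012AB3CD>, optionally with a |label suffix.
MENTION = re.compile(r"<@([A-Z0-9]+)(?:\|[^>]*)?>")


def thread_key(event: dict) -> str:
    """Replies carry thread_ts; a top-level message starts its own thread."""
    return event.get("thread_ts") or event["ts"]


class Router:
    def __init__(self, user_to_agent: dict[str, str]) -> None:
        # Maps each agent's Slack user ID (slack_user_id in the YAML) to its agent id.
        self.user_to_agent = user_to_agent
        self.queues: dict[str, asyncio.Queue] = {
            agent_id: asyncio.Queue() for agent_id in user_to_agent.values()
        }

    def targets(self, text: str) -> list[str]:
        """Return mentioned agents in order of first mention, without duplicates."""
        found: list[str] = []
        for user_id in MENTION.findall(text):
            agent_id = self.user_to_agent.get(user_id)
            if agent_id and agent_id not in found:
                found.append(agent_id)
        return found

    async def route(self, event: dict) -> list[str]:
        """Enqueue the event for every mentioned agent and report who got it."""
        agents = self.targets(event.get("text", ""))
        for agent_id in agents:
            await self.queues[agent_id].put(event)
        return agents
```

Each agent would run a worker that awaits its own queue, so a slow agent does not block delivery to the others.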

### Message Format

Agents post messages with a consistent format:
```
🤖 *Dev Dana*
> Working on task-789: Implement homepage
>
> Created branch `feature/task-789` and started implementation.
> Will have a PR ready for review in ~2 hours.
>
> @tech_lead heads up, will need your review later
```
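A formatter producing this layout might look like the sketch below. The function names are hypothetical. Note that a mention only notifies the recipient when written in Slack's `<@USERID>` form, so the `@tech_lead` shown above would need to be rendered through something like `mention()`.

```python
def format_agent_message(agent_name: str, body: str) -> str:
    """Prefix the agent's name and render the body as a Slack block quote."""
    lines = [f"🤖 *{agent_name}*"]
    for line in body.splitlines():
        # Blank lines stay inside the quote as a bare ">".
        lines.append(f"> {line}" if line.strip() else ">")
    return "\n".join(lines)


def mention(slack_user_id: str) -> str:
    """Render a user mention in the form Slack resolves to a notification."""
    return f"<@{slack_user_id}>"
```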

---

## Memory System

### Structure

```
/memory/
├── ceo/
│   └── memory.md           # CEO's persistent memory
├── product_manager/
│   └── memory.md           # PM's persistent memory
├── developer/
│   ├── memory.md           # General memory
│   ├── project-website.md  # Project-specific (future)
│   └── learnings.md        # Technical learnings (future)
└── tech_lead/
    └── memory.md
```

### Memory Format (`memory.md`)

```markdown
# Agent Memory - Dev Dana

## Active Context
- Currently working on: Company Website project
- Current branch: feature/task-789
- Pending PR: None

## Project Knowledge
### Company Website
- Tech stack: React, Tailwind, Next.js
- Design system: Uses shadcn/ui components
- PM: Paula, Tech Lead: Terry

## Team Preferences
- Paula prefers detailed task breakdowns
- Terry wants tests for all new features
- Code style: Prefer functional components

## Learnings
- 2025-01-15: This repo uses pnpm, not npm
- 2025-01-14: API endpoints are in /api folder

## Recent Decisions
- Using Next.js App Router (agreed with Terry)
```
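Because the memory file is organized by `##` sections, `update_memory` could be built on a helper that adds a bullet under a named section. The sketch below inserts new entries directly below the heading, which matches the newest-first order of the Learnings list above; the function name and the decision to create missing sections at the end of the file are assumptions.

```python
def append_to_section(markdown: str, heading: str, entry: str) -> str:
    """Insert `- entry` as the first bullet under `## heading`.

    If the section does not exist, it is appended at the end of the file.
    """
    lines = markdown.splitlines()
    target = f"## {heading}"
    for index, line in enumerate(lines):
        if line.strip() == target:
            lines.insert(index + 1, f"- {entry}")
            return "\n".join(lines) + "\n"
    lines.extend(["", target, f"- {entry}"])
    return "\n".join(lines) + "\n"
```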

---

## Implementation Phases

### Phase 1: Foundation (MVP)
**Goal:** Single agent responding to Slack messages

1. Set up project structure with pyproject.toml
2. Implement config loading (YAML → Pydantic models)
3. Basic Slack integration (listen to events, post messages)
4. Wrap Claude Agent SDK in Agent class
5. Simple filesystem tool
6. Memory loading/saving
7. Basic Docker container for one agent

**Deliverable:** Can @mention one agent in Slack and get responses

### Phase 2: Multi-Agent
**Goal:** Multiple agents running and communicating

1. Container manager for multiple agents
2. Message routing based on mentions
3. Agent-to-agent communication via Slack
4. Thread context preservation
5. All agents defined and running

**Deliverable:** Can have PM delegate to Developer via Slack

### Phase 3: Tools & Permissions
**Goal:** Full toolset with permission enforcement

1. Tool registry with permission checking
2. Git tool implementation
3. Tasks/project management tool
4. Code execution tool
5. Web search tool
6. Slack tools (create_channel, etc.)

**Deliverable:** Agents can do real work with proper restrictions

### Phase 4: Workflow & Polish
**Goal:** Smooth end-to-end workflows

1. PR creation and review flow
2. Task lifecycle (create → assign → progress → complete)
3. CEO supervision capabilities
4. Better memory management
5. Error handling and recovery
6. MCP server exposure

**Deliverable:** Full workflow from request → implementation → review → merge

### Phase 5: Production Ready
**Goal:** Deployable and maintainable

1. Docker Compose setup
2. Health checks and monitoring
3. Logging and observability
4. Configuration validation
5. Documentation
6. Integration tests

---

## Docker Compose Structure

```yaml
version: '3.8'

services:
  orchestrator:
    build:
      context: .
      dockerfile: Dockerfile.orchestrator
    environment:
      - SLACK_APP_TOKEN=${SLACK_APP_TOKEN}
      - SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./config:/app/config:ro
      - orchestra-data:/app/data
    networks:
      - orchestra-net
    depends_on:
      - agent-ceo
      - agent-pm
      - agent-dev
      - agent-techlead

  agent-ceo:
    build:
      context: .
      dockerfile: Dockerfile.agent
    environment:
      - AGENT_ID=ceo
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - orchestra-data:/data
    networks:
      - orchestra-net

  agent-pm:
    build:
      context: .
      dockerfile: Dockerfile.agent
    environment:
      - AGENT_ID=product_manager
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - orchestra-data:/data
    networks:
      - orchestra-net

  agent-dev:
    build:
      context: .
      dockerfile: Dockerfile.agent
    environment:
      - AGENT_ID=developer
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - orchestra-data:/data
    networks:
      - orchestra-net

  agent-techlead:
    build:
      context: .
      dockerfile: Dockerfile.agent
    environment:
      - AGENT_ID=tech_lead
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - orchestra-data:/data
    networks:
      - orchestra-net

volumes:
  orchestra-data:

networks:
  orchestra-net:
    driver: bridge
```

---

## CLAUDE.md (for Claude Code)

````markdown
# Agent Orchestra

Multi-agent system with Claude agents communicating via Slack.

## Quick Start
```bash
uv sync
cp .env.example .env  # Fill in tokens
uv run python -m orchestra.main
```

## Architecture
- Orchestrator: Routes Slack messages to agents
- Agents: Run in Docker containers, use Claude Agent SDK
- Tools: Filesystem, git, Slack, tasks, code execution
- Memory: Markdown files in /data/memory/{agent_id}/

## Key Files
- `config/orchestra.yml` - Global config
- `config/agents/*.yml` - Agent definitions
- `src/orchestra/core/orchestrator.py` - Main service
- `src/orchestra/core/agent.py` - Agent wrapper
- `src/orchestra/tools/` - Tool implementations

## Agent Config Structure
See `config/agents/developer.yml` for full example.
Key fields: id, name, model, system_prompt, tools, memory, can_mention

## Tool Permissions
Each tool has allowed/denied operations per agent.
Check `src/orchestra/tools/registry.py` for enforcement.

## Testing
```bash
uv run pytest
uv run pytest tests/test_agent.py -v
```

## Common Tasks
- Add new agent: Create YAML in config/agents/, add to docker-compose
- Add new tool: Implement in src/orchestra/tools/, register in registry.py
- Debug agent: Check logs with `docker logs agent-{id}`
````

---

## Dependencies

```toml
[project]
name = "agent-orchestra"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "anthropic>=0.40.0",
    "claude-agent-sdk>=0.1.0",     # Claude Agent SDK for Python
    "slack-sdk>=3.27.0",
    "slack-bolt>=1.18.0",
    "pydantic>=2.5.0",
    "pydantic-settings>=2.1.0",
    "pyyaml>=6.0.0",
    "docker>=7.0.0",
    "aiohttp>=3.9.0",
    "httpx>=0.26.0",
    "gitea-api>=0.1.0",            # Gitea API client (or use httpx directly)
    "mcp>=0.1.0",                  # MCP SDK
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "pytest-mock>=3.12.0",
    "ruff>=0.1.0",
]
```

---

## Next Steps

1. Initialize the repository with this structure
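2. Set up a Slack app with Socket Mode enabled (this requires an app-level token with the `connections:write` scope) and the following Bot Token Scopes: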
   - `app_mentions:read`
   - `channels:history`
   - `channels:manage`
   - `channels:read`
   - `chat:write`
   - `files:write`
   - `reactions:write`
   - `users:read`
3. Create bot users for each agent in Slack
4. Implement Phase 1 (Foundation)
5. Iterate through remaining phases
