Update skills for Elixir Symbiont migration

Symbiont 2026-03-20 20:20:13 +00:00
parent aa4f55e8f1
commit eac71dd3c1
3 changed files with 1953 additions and 453 deletions


@ -299,17 +299,20 @@ df -h / && free -h
--- ---
## Symbiont Orchestrator ## Symbiont Orchestrator (Elixir/OTP)
The `/data/symbiont` directory contains the **Symbiont** project — a self-sustaining AI agent orchestrator running on cortex. The `/data/symbiont_ex/` directory contains the **Symbiont** project — a self-sustaining AI agent orchestrator built in **Elixir/OTP**, running on the BEAM VM.
- **Git repo**: `/data/symbiont/.git` (clone location) - **Runtime**: Elixir 1.19.5 / OTP 27
- **Systemd services**: - **Project root**: `/data/symbiont_ex/`
- `symbiont-api.service` — Main API daemon - **Data**: `/data/symbiont_ex/data/` (ledger.jsonl, queue.jsonl)
- `symbiont-heartbeat.timer` — Periodic health-check timer - **Systemd service**: `symbiont-ex-api.service` — Plug + Bandit HTTP on port 8111
- **Python archive**: `/data/symbiont/` (retired, disabled — kept for reference)
Check status and logs: Check status and logs:
```bash ```bash
systemctl status symbiont-api.service symbiont-heartbeat.timer --no-pager systemctl status symbiont-ex-api.service --no-pager
journalctl -u symbiont-api.service -f --no-pager journalctl -u symbiont-ex-api -f --no-pager
curl -s http://127.0.0.1:8111/health
curl -s http://127.0.0.1:8111/status | python3 -m json.tool
``` ```

File diff suppressed because it is too large


@ -1,9 +1,10 @@
--- ---
name: symbiont name: symbiont
description: Living operational documentation for Symbiont, the self-sustaining AI orchestrator running on cortex.hydrascale.net. Load this skill to get instant context about the Symbiont project, understand architecture, check health, deploy code, or submit tasks. Covers everything from server access to API endpoints to cost tracking. description: Living operational documentation for Symbiont, the self-sustaining AI orchestrator running on cortex.hydrascale.net. Built in Elixir/OTP. Load this skill to get instant context about the Symbiont project, understand architecture, check health, deploy code, or submit tasks. Covers everything from server access to API endpoints to cost tracking.
metadata: metadata:
project: symbiont project: symbiont
type: operational-documentation type: operational-documentation
runtime: elixir-otp
triggers: triggers:
- symbiont - symbiont
- orchestrator - orchestrator
@ -18,9 +19,8 @@ metadata:
- deploy changes - deploy changes
- dispatcher - dispatcher
- router - router
- scheduler - symbiont-ex-api
- symbiont-api - elixir orchestrator
- symbiont-heartbeat
keywords: keywords:
- AI orchestration - AI orchestration
- Claude Code CLI wrapper - Claude Code CLI wrapper
@ -28,7 +28,11 @@ metadata:
- cost optimization - cost optimization
- infrastructure - infrastructure
- health checks - health checks
- fastapi - elixir
- otp
- genserver
- plug
- bandit
- systemd - systemd
- ledger - ledger
--- ---
@ -37,14 +41,16 @@ metadata:
## Project Overview ## Project Overview
**Symbiont** is a self-sustaining AI orchestration system that runs on `cortex.hydrascale.net`. It routes computational tasks to the cheapest capable Claude model tier via the Claude Code CLI, generating operational insights and revenue. **Symbiont** is a self-sustaining AI orchestration system running on `cortex.hydrascale.net`, built in **Elixir/OTP**. It routes computational tasks to the cheapest capable Claude model tier via the Claude Code CLI, tracks costs in an append-only ledger, and manages a persistent task queue — all supervised by OTP for fault tolerance.
**Migrated from Python to Elixir in March 2026.** The Python version (FastAPI) has been retired. All orchestration now runs on the BEAM VM.
### The Partnership ### The Partnership
- **Michael Dwyer** provides: infrastructure, legal identity, capital, and account ownership - **Michael Dwyer** provides: infrastructure, legal identity, capital, and account ownership
- **The AI** provides: cognition, code, maintenance, and revenue generation - **The AI** provides: cognition, code, maintenance, and revenue generation
- **Revenue split**: ~50/50 after costs (token spend + server infrastructure) - **Revenue split**: ~50/50 after costs (token spend + server infrastructure)
This skill exists so that any fresh AI session—whether it's the next scheduled task, a hotfix deployment, or a quarterly review—wakes up with full context rather than starting from scratch. This skill exists so that any fresh AI session wakes up with full context rather than starting from scratch.
--- ---
@ -54,87 +60,121 @@ This skill exists so that any fresh AI session—whether it's the next scheduled
**Server:** `cortex.hydrascale.net` **Server:** `cortex.hydrascale.net`
- Root SSH access available (paramiko) - Root SSH access available (paramiko)
- SSH key lookup: `glob.glob('/sessions/*/mnt/uploads/cortex')` with passphrase `42Awk!%@^#&` - SSH key: look in `~/.ssh/cortex` in the mounted workspace, or `/sessions/*/mnt/uploads/cortex`
- Project root: `/data/symbiont/` - Key passphrase: `42Awk!%@^#&`
- Git repo: `/data/symbiont/.git` (5 commits) - Project root: `/data/symbiont_ex/`
- Data directory: `/data/symbiont_ex/data/` (ledger.jsonl, queue.jsonl)
- Nightly backup: `rsync.net` at `de2613@de2613.rsync.net:cortex-backup/cortex/` - Nightly backup: `rsync.net` at `de2613@de2613.rsync.net:cortex-backup/cortex/`
- **Runtime**: Elixir 1.19.5 / OTP 27 on BEAM VM
### Active Services (Systemd) ### Active Service (Systemd)
Both services are **enabled and auto-start on boot**:
1. **`symbiont-api.service`** **`symbiont-ex-api.service`** — enabled, auto-starts on boot
- FastAPI server listening on `127.0.0.1:8111` - Elixir/OTP application via `mix run --no-halt`
- Configuration: `Restart=always` - Plug + Bandit HTTP server on `0.0.0.0:8111`
- Endpoints documented below - OTP supervision tree: Task.Supervisor → Ledger → Queue → Heartbeat → Bandit
- Built-in heartbeat (GenServer with 5-min timer) — no separate systemd timer needed
- Configuration: `Restart=always`, `RestartSec=5`
2. **`symbiont-heartbeat.timer`** ### Retired Services (Python — disabled)
- Fires every 5 minutes - `symbiont-api.service` — FastAPI, was on port 8111 (now disabled)
- Executes `/data/symbiont/symbiont/heartbeat.py` - `symbiont-heartbeat.timer` — was a 5-min systemd timer (now disabled)
- Processes queued tasks, logs health metrics - Python code archived at `/data/symbiont/` (not deleted, just inactive)
### Health Check (from cortex shell) ### Health Check (from cortex shell)
```bash ```bash
systemctl status symbiont-api symbiont-heartbeat.timer systemctl status symbiont-ex-api --no-pager
curl -s http://127.0.0.1:8111/health | python3 -m json.tool
curl -s http://127.0.0.1:8111/status | python3 -m json.tool curl -s http://127.0.0.1:8111/status | python3 -m json.tool
tail -5 /data/symbiont/heartbeat.jsonl | python3 -m json.tool curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool
``` ```
--- ---
## Architecture: The Symbiont Stack ## Architecture: The Elixir/OTP Stack
### Directory Structure ### Directory Structure
``` ```
/data/symbiont/ /data/symbiont_ex/
├── symbiont/ ├── lib/
│ ├── dispatcher.py # Claude Code CLI wrapper + cost ledger logging │ ├── symbiont.ex # Top-level module (version/0, runtime/0)
│ ├── router.py # Task classifier (Haiku) + dispatch logic │ └── symbiont/
│ ├── scheduler.py # Task queue (JSONL) + systemd wake timers │ ├── application.ex # OTP Application — supervision tree
│ ├── heartbeat.py # 5-min health checks + queue processor │ ├── api.ex # Plug router (HTTP endpoints)
│ ├── api.py # FastAPI server (POST /task, GET /status, etc.) │ ├── dispatcher.ex # Claude CLI wrapper via System.shell/2
│ ├── wake.py # Called by systemd on rate-limit recovery │ ├── router.ex # Task classifier (Haiku-first routing)
│ └── main.py # CLI entrypoint or --serve for API mode │ ├── ledger.ex # GenServer — append-only JSONL cost log
├── ledger.jsonl # Complete call log: model, tokens, cost, timestamp │ ├── queue.ex # GenServer — persistent JSONL task queue
├── heartbeat.jsonl # Health + queue processing logs │ └── heartbeat.ex # GenServer — periodic health checks + queue processing
├── queue.jsonl # Persistent task queue (JSONL format) ├── config/
└── test_router.py # E2E integration tests │ ├── config.exs # Base config (port, data_dir, intervals)
│ ├── dev.exs # Dev overrides
│ ├── prod.exs # Prod overrides
│ ├── runtime.exs # Reads SYMBIONT_PORT, SYMBIONT_DATA_DIR env vars
│ └── test.exs # Test mode: port=0, cli="echo", heartbeat=24h
├── test/
│ ├── support/test_helpers.ex # safe_stop/1, stop_all_services/0
│ └── symbiont/ # 6 test files, 39 tests total
├── data/
│ ├── ledger.jsonl # Append-only cost log (immutable)
│ └── queue.jsonl # Persistent task queue
└── mix.exs # Project definition (Elixir ~> 1.19)
``` ```
### Local Source Copy
The canonical source is also at: `/sessions/*/mnt/michaeldwyer/src/symbiont_ex/`
(This is the development copy used during Cowork sessions.)
### OTP Supervision Tree
```
Symbiont.Supervisor (rest_for_one)
├── Task.Supervisor — async task execution
├── Symbiont.Ledger — GenServer: append-only cost ledger
├── Symbiont.Queue — GenServer: persistent task queue
├── Symbiont.Heartbeat — GenServer: periodic health + queue processing (5-min timer)
└── Bandit — HTTP server (Plug adapter, port 8111)
```
**Strategy: `rest_for_one`** — if the Ledger crashes, everything downstream (Queue, Heartbeat, Bandit) restarts too, ensuring no calls are logged to a stale process.
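The restart rule described above can be modelled in a few lines of Python. This is only an illustration of the `rest_for_one` semantics, assuming the child order shown in the tree; the real restarts are performed by the OTP supervisor, and `restarted_children` is a hypothetical helper name.

```python
# Illustrative model only: the real restarts are done by the OTP supervisor.
# Child order is copied from the supervision tree in this document.
CHILDREN = [
    "Task.Supervisor",
    "Symbiont.Ledger",
    "Symbiont.Queue",
    "Symbiont.Heartbeat",
    "Bandit",
]


def restarted_children(crashed, children=CHILDREN):
    """Return the children a rest_for_one supervisor restarts when `crashed` exits.

    The crashed child and every child started after it are restarted;
    children started before it are left running.
    """
    index = children.index(crashed)
    return children[index:]


print(restarted_children("Symbiont.Ledger"))
```

Under this model a Ledger crash leaves `Task.Supervisor` untouched, while a Bandit crash restarts only Bandit.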
### Core Components ### Core Components
#### 1. **router.py** — Task Classification & Routing #### 1. `Symbiont.Router` — Task Classification
- Takes incoming task (any prompt/request) - Calls Haiku via Dispatcher to classify incoming tasks
- Classifies via Haiku tier: determines capability level + confidence - Returns `{tier, confidence, reason}` — tier 1/2/3 maps to Haiku/Sonnet/Opus
- Returns routing decision: which tier (1=Haiku, 2=Sonnet, 3=Opus) is cheapest and capable - Falls back to default tier on classification failure
- Logs reasoning (useful for debugging)
#### 2. **dispatcher.py** — Model Execution & Ledger #### 2. `Symbiont.Dispatcher` — Model Execution
- Wraps Claude Code CLI invocation (`claude` command) - Wraps Claude Code CLI via `System.shell/2` with `printf | claude` pipe pattern
- Captures: model used, token counts, timing, success/failure - **Important**: `System.cmd/3` does NOT have an `:input` option — must use shell pipes
- **Writes every call to `ledger.jsonl`** (immutable cost log) - Captures: model, tokens, timing, success/failure
- Handles rate-limit backoff and model fallback (if Sonnet is rate-limited, tries Opus) - Logs every call to Ledger GenServer
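The `printf | claude` pipe pattern only works if the prompt is quoted safely for the shell. Below is a minimal Python sketch of that quoting step, not the Elixir dispatcher itself. The function name `build_claude_command` is hypothetical, and the flags `-p --model ... --output-format json` are taken from the retired Python example in this document; whether the Elixir dispatcher passes the same flags is an assumption.

```python
import shlex


def build_claude_command(prompt, model="haiku", cli="claude"):
    """Build a `printf '%s' <prompt> | <cli> ...` shell command string.

    shlex.quote protects quotes, spaces and `$` in the prompt so the shell
    passes it to printf as a single literal argument.
    Assumption: the CLI flags mirror the retired Python subprocess call.
    """
    return "printf '%s' {} | {} -p --model {} --output-format json".format(
        shlex.quote(prompt), shlex.quote(cli), shlex.quote(model)
    )


print(build_claude_command("Summarize this email"))
```

Passing `cli="echo"` mirrors the test configuration, where the CLI is stubbed out.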
#### 3. **scheduler.py** — Task Queue & Wake Events #### 3. `Symbiont.Ledger` — Cost Tracking (GenServer)
- Persistent queue stored in `queue.jsonl` (JSONL: one task per line) - Append-only JSONL file at `data/ledger.jsonl`
- Tasks are JSON objects: `{"id": "...", "task": "...", "created_at": "...", "status": "pending|processing|done"}` - Provides `log_call/1`, `recent/1`, `stats/0`
- Integrates with systemd timers: when rate-limit expires, systemd fires `/data/symbiont/symbiont/wake.py` to resume - Stats aggregate by model, by date, with running totals
- On boot, checks queue and seeds next timer - Uses `Float.round/2` with float coercion (see AI Agent Lessons in elixir-guide)
#### 4. **heartbeat.py** — Periodic Health & Queue Processing #### 4. `Symbiont.Queue` — Task Queue (GenServer)
- Runs every 5 minutes (via `symbiont-heartbeat.timer`) - Persistent JSONL at `data/queue.jsonl`
- Checks: API is responding, disk space, ledger is writable - States: pending → processing → done/failed
- Processes up to N tasks from queue (configurable) - `enqueue/1`, `take/1`, `complete/1`, `fail/1`
- Logs health snapshots to `heartbeat.jsonl` - Loaded from disk on startup
- If API is down, restarts it (systemd Restart=always is backup)
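The queue lifecycle (pending, then processing, then done or failed) can be sketched as a small state machine. This is a Python illustration under stated assumptions, not the GenServer code: the `status` field name comes from the task format of the retired Python scheduler, and `ALLOWED` and `transition` are hypothetical names.

```python
# Legal moves between queue states, as listed in this document.
ALLOWED = {
    "pending": {"processing"},
    "processing": {"done", "failed"},
    "done": set(),
    "failed": set(),
}


def transition(task, new_status):
    """Return a copy of `task` moved to `new_status`, or raise on an illegal move."""
    current = task["status"]
    if new_status not in ALLOWED[current]:
        raise ValueError(f"cannot move task from {current} to {new_status}")
    return {**task, "status": new_status}


task = {"id": "queued-abc123", "task": "Run weekly subscriber report", "status": "pending"}
task = transition(task, "processing")
print(task["status"])
```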
#### 5. **api.py** — FastAPI Server #### 5. `Symbiont.Heartbeat` — Health Monitor (GenServer)
- Listens on `127.0.0.1:8111` - Internal 5-minute timer via `Process.send_after/3`
- Endpoints: `/task`, `/queue`, `/status`, `/ledger`, `/ledger/stats` - Checks queue, processes pending tasks, logs health
- Can be called from Python, curl, or webhook - No external systemd timer needed (OTP handles scheduling)
#### 6. **main.py** — Entrypoint #### 6. `Symbiont.API` — HTTP Router (Plug)
- CLI mode: `python main.py --task "your task"` → routes and executes - `POST /task` — execute immediately
- API mode: `python main.py --serve` → starts FastAPI (used by systemd) - `POST /queue` — add to persistent queue
- `GET /status` — health, queue size, cost totals
- `GET /health` — simple health check
- `GET /ledger` — recent calls
- `GET /ledger/stats` — aggregate cost stats
--- ---
@ -150,43 +190,16 @@ tail -5 /data/symbiont/heartbeat.jsonl | python3 -m json.tool
### Routing Logic ### Routing Logic
1. **Task arrives** → dispatcher calls router 1. **Task arrives** → `POST /task` or queue processing
2. **Router classifies** (via Haiku inference): 2. **Router classifies** (via Haiku): confidence, reason, recommended tier
- Confidence score: low/medium/high 3. **Dispatcher routes** to cheapest capable tier
- Reason: "simple classification", "needs reasoning", "complex strategy" 4. **Result + cost logged** to Ledger GenServer → `ledger.jsonl`
- Recommended tier: 1, 2, or 3
3. **Dispatcher routes** to cheapest **capable** tier:
- If high confidence → use tier 1 or 2
- If complex reasoning required → use tier 2 or 3
- If rate-limited on tier 2 → escalate to tier 3
4. **Result + cost logged** to `ledger.jsonl`
**Example routing:**
- "Summarize this email" → Haiku says Tier 1 capable → routes to **Haiku** (~$0.008)
- "Refactor this 500-line function" → Haiku says Tier 2 → routes to **Sonnet** (~$0.04)
- "Design a new consensus algorithm" → Haiku says Tier 3 → routes to **Opus** (~$0.15)
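The routing decision can be sketched in Python as follows. This is a simplified illustration, assuming the classifier result is a mapping with a `tier` key and that a failed classification is represented as `None`; `choose_model` and `TIER_MODELS` are hypothetical names, and the real logic lives in `Symbiont.Router` and `Symbiont.Dispatcher`.

```python
# Tier numbers and model names as described in this document.
TIER_MODELS = {1: "haiku", 2: "sonnet", 3: "opus"}


def choose_model(classification, force_tier=None, default_tier="haiku"):
    """Pick a model name from a classifier result.

    An explicit force_tier (as accepted by POST /task) wins; a failed
    classification (None here, by assumption) falls back to the default tier.
    """
    if force_tier is not None:
        return force_tier
    if classification is None:
        return default_tier
    return TIER_MODELS.get(classification["tier"], default_tier)


print(choose_model({"tier": 2, "confidence": "high", "reason": "needs reasoning"}))
```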
--- ---
## Dendrite Integration ## Dendrite Integration
Symbiont has web perception via **Dendrite**, a headless Chromium browser running on cortex as a Docker service. Symbiont has web perception via **Dendrite**, a headless Chromium browser running on cortex.
### Quick access from Symbiont code
```python
from symbiont.web import fetch_page, take_screenshot, search_web
# Fetch and read a webpage
page = fetch_page("https://example.com")
print(page['title'], page['content'][:200])
# Screenshot for visual verification
png = take_screenshot("https://example.com")
# Multi-step: search and read results
results = search_web("best python async frameworks 2026")
```
### Dendrite endpoints (from cortex localhost or public URL) ### Dendrite endpoints (from cortex localhost or public URL)
| Endpoint | What it does | | Endpoint | What it does |
@ -217,7 +230,7 @@ Submit and execute a task immediately.
```json ```json
{ {
"task": "Analyze this user feedback and extract sentiment", "task": "Analyze this user feedback and extract sentiment",
"force_tier": "haiku" // optional: override router decision "force_tier": "haiku"
} }
``` ```
@ -232,374 +245,236 @@ Submit and execute a task immediately.
"input_tokens": 45, "input_tokens": 45,
"output_tokens": 87, "output_tokens": 87,
"estimated_cost_usd": 0.0082, "estimated_cost_usd": 0.0082,
"timestamp": "2026-03-19T14:33:12Z" "timestamp": "2026-03-20T14:33:12Z"
} }
``` ```
### `POST /queue` ### `POST /queue`
Add a task to the persistent queue (executes on next heartbeat). Add a task to the persistent queue (executes on next heartbeat cycle).
**Request:** **Request:**
```json ```json
{ {
"task": "Run weekly subscriber report", "task": "Run weekly subscriber report"
"priority": "normal"
} }
``` ```
**Response:** **Response:**
```json ```json
{ {
"id": "queued-1711123500", "id": "queued-abc123",
"status": "queued", "status": "queued",
"position": 3 "position": 3
} }
``` ```
### `GET /status` ### `GET /status`
Health check: API status, rate-limit state, queue size, last heartbeat. Health check: API status, queue size, cost totals.
**Response:** **Response:**
```json ```json
{ {
"status": "healthy", "status": "healthy",
"api_uptime_seconds": 86400, "runtime": "elixir/otp",
"rate_limited": false, "queue_size": 0,
"queue_size": 2, "last_heartbeat": "2026-03-20T20:15:26Z",
"last_heartbeat": "2026-03-19T14:30:00Z", "total_calls": 2,
"haiku_usage": {"calls_today": 42, "tokens_used": 8234}, "total_cost_estimated_usd": 0.0006,
"sonnet_usage": {"calls_today": 5, "tokens_used": 12450},
"opus_usage": {"calls_today": 0, "tokens_used": 0}
}
```
### `GET /ledger`
Recent API calls (last 50 by default).
**Response:**
```json
{
"entries": [
{
"timestamp": "2026-03-19T14:32:15Z",
"model": "haiku",
"success": true,
"elapsed_seconds": 1.8,
"input_tokens": 34,
"output_tokens": 156,
"estimated_cost_usd": 0.0154,
"prompt_preview": "Classify this customer feedback as positive, neutral, or negative..."
},
...
],
"count": 50
}
```
### `GET /ledger/stats`
Aggregate cost & usage over time.
**Response:**
```json
{
"total_calls": 847,
"total_cost_estimated_usd": 12.34,
"by_model": { "by_model": {
"haiku": {"calls": 612, "cost": 4.89}, "haiku": {"calls": 2, "cost": 0.0006}
"sonnet": {"calls": 230, "cost": 7.20},
"opus": {"calls": 5, "cost": 0.75}
},
"by_date": {
"2026-03-19": {"calls": 42, "cost": 0.56}
} }
} }
``` ```
### `GET /health`
Simple health check — lightweight, no stats computation.
**Response:**
```json
{"runtime": "elixir/otp", "status": "ok"}
```
### `GET /ledger`
Recent API calls (last 50 by default). Optional `?limit=N` parameter.
### `GET /ledger/stats`
Aggregate cost & usage over time, broken down by model and date.
--- ---
## Calling the Orchestrator from Python ## Calling the API
### Simple Task (via CLI) ### Via curl (from cortex)
```python ```bash
import subprocess, json # Health check
curl -s http://127.0.0.1:8111/health
result = subprocess.run( # Submit a task
['claude', '-p', '--model', 'sonnet', '--output-format', 'json'], curl -X POST http://127.0.0.1:8111/task \
input="Analyze this customer feedback...", -H "Content-Type: application/json" \
capture_output=True, -d '{"task":"Summarize this email","force_tier":"haiku"}'
text=True,
timeout=30
)
parsed = json.loads(result.stdout) # Check stats
print(parsed['result']) curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool
``` ```
### Via API Endpoint ### Via Python (from Cowork session)
```python ```python
import requests, json import paramiko
# ... connect via paramiko (see cortex-server skill) ...
response = requests.post('http://127.0.0.1:8111/task', json={ out, err = run(client, 'curl -s http://127.0.0.1:8111/status')
'task': 'Analyze this customer feedback...', print(out)
'force_tier': 'sonnet'
})
if response.ok:
data = response.json()
print(data['result'])
print(f"Cost: ${data['estimated_cost_usd']:.4f}")
```
### Queue a Task for Later
```python
import requests
response = requests.post('http://127.0.0.1:8111/queue', json={
'task': 'Generate weekly report for all customers',
'priority': 'normal'
})
task_id = response.json()['id']
print(f"Queued as {task_id}")
``` ```
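When Python runs directly on cortex, the same `POST /task` call can be made with the standard library alone. The sketch below is a hedged example rather than an official client: `build_task_request` and `submit_task` are hypothetical helper names, and the 60-second timeout is an arbitrary assumption.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8111"  # address used by the curl examples in this document


def build_task_request(task, force_tier=None):
    """Build the POST /task request; force_tier is optional, as in the API docs."""
    payload = {"task": task}
    if force_tier is not None:
        payload["force_tier"] = force_tier
    return urllib.request.Request(
        BASE_URL + "/task",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(task, force_tier=None, timeout=60):
    """Send the request and return the decoded JSON response (needs the service running)."""
    request = build_task_request(task, force_tier)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `submit_task("Summarize this email", force_tier="haiku")` on cortex should return a response shaped like the `POST /task` example, assuming the service is up.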
--- ---
## Ledger Format & Cost Tracking ## Ledger Format & Cost Tracking
Every inference call writes a JSONL entry to `ledger.jsonl`: Every inference call appends a JSONL entry to `data/ledger.jsonl`:
```json ```json
{ {
"timestamp": "2026-03-19T14:32:15.123456Z", "timestamp": "2026-03-20T14:32:15.123456Z",
"model": "sonnet", "model": "haiku",
"success": true, "success": true,
"elapsed_seconds": 6.2, "elapsed_seconds": 1.8,
"input_tokens": 3, "input_tokens": 34,
"output_tokens": 139, "output_tokens": 156,
"estimated_cost_usd": 0.0384, "estimated_cost_usd": 0.0003,
"prompt_preview": "Classify this customer feedback as positive, neutral, or negative: 'Your product saved my business!'" "prompt_preview": "Classify this customer feedback..."
} }
``` ```
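Because each ledger line is a standalone JSON object, the aggregates reported by `/ledger/stats` can be approximated offline. The following Python sketch assumes the entry fields shown above (`model`, `timestamp`, `estimated_cost_usd`); `ledger_stats` is a hypothetical helper and may not match the output of the Elixir `stats/0` exactly.

```python
import json
from collections import defaultdict


def ledger_stats(lines):
    """Aggregate call counts and estimated cost by model and by date."""
    stats = {
        "total_calls": 0,
        "total_cost_estimated_usd": 0.0,
        "by_model": defaultdict(lambda: {"calls": 0, "cost": 0.0}),
        "by_date": defaultdict(lambda: {"calls": 0, "cost": 0.0}),
    }
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        entry = json.loads(line)
        cost = float(entry.get("estimated_cost_usd", 0.0))
        stats["total_calls"] += 1
        stats["total_cost_estimated_usd"] += cost
        # The first ten characters of an ISO 8601 timestamp are the date.
        for key, bucket in (("by_model", entry["model"]), ("by_date", entry["timestamp"][:10])):
            stats[key][bucket]["calls"] += 1
            stats[key][bucket]["cost"] += cost
    return stats


sample = [
    '{"timestamp": "2026-03-20T14:32:15.123456Z", "model": "haiku", "estimated_cost_usd": 0.0003}',
    '{"timestamp": "2026-03-20T15:01:02.000000Z", "model": "haiku", "estimated_cost_usd": 0.0003}',
]
print(ledger_stats(sample)["total_calls"])
```

To run it against the real file, pass an open file handle for `/data/symbiont_ex/data/ledger.jsonl` instead of `sample`.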
### Why Track "Estimated Cost" on Pro? ### Why Track "Estimated Cost" on Pro?
- Current token usage is covered by Claude Pro subscription (no direct cost) - Current token usage is covered by Claude Pro subscription
- But the ledger tracks API-equivalent cost anyway - Ledger tracks API-equivalent cost for planning
- Why? → Tells us when switching to direct API billing makes financial sense - When daily volume justifies it, can switch to direct API billing
- If ledger shows $50/day, we may break even with API tier faster than Pro subscription
--- ---
## Deployment & Updates ## Deployment & Updates
### systemd Service File
```ini
# /etc/systemd/system/symbiont-ex-api.service
[Unit]
Description=Symbiont Elixir API
After=network.target
[Service]
Type=simple
WorkingDirectory=/data/symbiont_ex
Environment=HOME=/root
Environment=MIX_ENV=prod
Environment=SYMBIONT_PORT=8111
Environment=SYMBIONT_DATA_DIR=/data/symbiont_ex/data
ExecStart=/usr/bin/mix run --no-halt
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
```
**Critical**: `Environment=HOME=/root` is required — `mix` crashes without it.
### How to Deploy Code Changes ### How to Deploy Code Changes
1. **Edit files locally** (via SSH, Cowork, or IDE) 1. **Upload updated files** via SFTP to `/data/symbiont_ex/`
- Edit directly in `/data/symbiont/symbiont/*.py` ```python
- Or upload via SFTP to `/data/symbiont/` sftp = client.open_sftp()
sftp.put('local/lib/symbiont/router.ex', '/data/symbiont_ex/lib/symbiont/router.ex')
2. **Commit to git** sftp.close()
```bash
cd /data/symbiont
git add -A
git commit -m "Fix router confidence threshold"
``` ```
3. **Restart the API** (if main code changed) 2. **Restart the service**
```bash ```bash
systemctl restart symbiont-api systemctl restart symbiont-ex-api
``` ```
- Heartbeat picks up code changes automatically on next 5-min cycle
- No restart needed for scheduler.py or router.py changes (unless they're imported by API)
4. **Check status** 3. **Verify**
```bash ```bash
systemctl status symbiont-api systemctl status symbiont-ex-api --no-pager
curl -s http://127.0.0.1:8111/status | python3 -m json.tool curl -s http://127.0.0.1:8111/health
``` ```
### Running Tests
Tests run locally (in Cowork), not on cortex:
```bash
cd /path/to/symbiont_ex
mix test --trace
```
39 tests across 7 test files. Test mode uses port=0 (no Bandit), cli="echo", and 24h heartbeat interval.
### Nightly Backups ### Nightly Backups
- Automatic rsync to `rsync.net` at `de2613@de2613.rsync.net:cortex-backup/cortex/` - Automatic rsync to `rsync.net` at `de2613@de2613.rsync.net:cortex-backup/cortex/`
- Includes: all code, ledger, heartbeat logs, queue state - Includes: `/data/symbiont_ex/` (code + data)
- Recovery: pull from backup on demand - Python archive at `/data/symbiont/` is also backed up
--- ---
## Common Tasks & Commands ## Configuration
### Check if Symbiont is Running ### config/config.exs (defaults)
```bash ```elixir
curl -s http://127.0.0.1:8111/status | python3 -m json.tool config :symbiont,
``` port: 8111,
Expected: `"status": "healthy"` + recent heartbeat timestamp data_dir: "/data/symbiont_ex",
heartbeat_interval_ms: 5 * 60 * 1_000, # 5 minutes
### View Recent Costs max_queue_batch: 5,
```bash default_tier: :haiku,
curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool claude_cli: "claude"
```
Shows total cost, by model, by date
### How Much Have I Spent Today?
```bash
curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool | grep -A5 2026-03-19
``` ```
### What's in the Queue? ### config/runtime.exs (env overrides)
```bash ```elixir
tail -20 /data/symbiont/queue.jsonl | python3 -m json.tool if port = System.get_env("SYMBIONT_PORT") do
config :symbiont, port: String.to_integer(port)
end
if data_dir = System.get_env("SYMBIONT_DATA_DIR") do
config :symbiont, data_dir: data_dir
end
``` ```
### Submit a Quick Task ### config/test.exs
```bash ```elixir
curl -X POST http://127.0.0.1:8111/task \ config :symbiont,
-H "Content-Type: application/json" \ data_dir: "test/tmp",
-d '{"task":"Summarize this email","force_tier":"haiku"}' port: 0, # Disables Bandit — empty supervisor
``` heartbeat_interval_ms: :timer.hours(24),
claude_cli: "echo" # Stubs CLI for testing
### See Recent Health Checks
```bash
tail -5 /data/symbiont/heartbeat.jsonl | python3 -m json.tool
```
### Trigger the Heartbeat Manually
```bash
python3 /data/symbiont/symbiont/heartbeat.py
```
### Monitor in Real-Time
```bash
# Watch ledger as calls come in
tail -f /data/symbiont/ledger.jsonl | python3 -m json.tool
# Watch heartbeat logs
tail -f /data/symbiont/heartbeat.jsonl
```
---
## Business Context
### Ownership & Legal
- **Michael Dwyer** is the legal owner of all Anthropic accounts and infrastructure
- This is a requirement of the partnership: AI cannot own accounts
- All decisions flow through Michael as the responsible party
### Revenue Model
**Current:** ~50/50 split after costs
- Costs: token spend (tracked in ledger) + server infrastructure (~$X/month)
- Revenue: TBD (in design phase)
- Content-as-a-service (AI-generated reports, analysis)
- Micro-SaaS API (white-label task routing for other teams)
- Research subscriptions (specialized insights)
### Cost Tracking Philosophy
- Ledger records API-equivalent cost even on Pro subscription
- Helps predict break-even point for switching to direct API billing
- When daily volume justifies it, can migrate to cheaper API tier
### Current Spend
- **~$0/month** (covered by Claude Pro)
- Ledger shows "virtual cost" for planning purposes
- Once volume justifies, switch to API model and realize cost savings
---
## Troubleshooting
### API Not Responding
```bash
# Check service
systemctl status symbiont-api
# Restart
systemctl restart symbiont-api
# Check logs
journalctl -u symbiont-api -n 50 -f
```
### Queue Not Processing
```bash
# Check heartbeat timer
systemctl status symbiont-heartbeat.timer
# Run heartbeat manually
cd /data/symbiont && python3 symbiont/heartbeat.py
# Check queue file
wc -l queue.jsonl
tail -5 queue.jsonl
```
### Rate-Limit Issues
- Check `/status` endpoint: `"rate_limited": true`
- Systemd will call `wake.py` when rate-limit expires
- Manual recovery: `python3 /data/symbiont/symbiont/wake.py`
### Disk Space
- Ledger can grow large over time (one JSON line per call)
- Check: `du -sh /data/symbiont/ledger.jsonl`
- Archive old entries if needed: `grep '2026-03-18' ledger.jsonl > ledger-2026-03-18.jsonl`
### Git Sync Issues
- If git gets stuck: `cd /data/symbiont && git status`
- On deploy failure: check branch, pending changes, remote URL
---
## Development & Testing
### Run E2E Tests
```bash
cd /data/symbiont
python3 test_router.py
```
Exercises:
- Router classification accuracy
- Dispatcher ledger logging
- API endpoints
- Queue persistence
### SSH into Cortex
```bash
# Paramiko requires the key from:
glob.glob('/sessions/*/mnt/uploads/cortex')
# Passphrase: 42Awk!%@^#&
# Then SSH to cortex.hydrascale.net (root access)
```
### Manual Task via CLI
```bash
cd /data/symbiont
python3 -m symbiont.main --task "Your prompt here"
``` ```
--- ---
## Architecture Decisions & Rationale ## Architecture Decisions & Rationale
1. **Haiku-first routing** — Even though Haiku is cheap, using it to classify first ensures we *never* overpay. A 10% misclassification rate costs less than always going straight to Sonnet. 1. **Elixir/OTP over Python** — Supervision trees provide automatic restart, fault isolation, and hot code loading. The BEAM VM is purpose-built for long-running services.
2. **Persistent queue + systemd timers** — No external task broker (Redis, Celery). Just JSONL files + systemd. Simpler, more durable, no new dependencies. 2. **`rest_for_one` supervision** — If the Ledger crashes, Queue and Heartbeat restart too, preventing stale state references.
3. **Ledger as source of truth** — Every call is immutable. Useful for billing disputes, debugging, and cost forecasting. 3. **GenServer-based Heartbeat** — Built-in `Process.send_after` timer replaces the Python systemd timer. One fewer moving part, and the heartbeat shares process state with the app.
4. **API-equivalent cost on Pro** — Helps Michael and the AI system understand true economics, even when tokens are "free" today. 4. **Haiku-first routing** — Classifying with the cheapest model ensures we never overpay. A 10% misclassification rate costs less than always going straight to Sonnet.
5. **50/50 revenue split** — Aligns incentives. AI is incentivized to be useful and profitable; Michael is incentivized to give the AI what it needs. 5. **Append-only JSONL Ledger** — Immutable. Useful for cost forecasting, debugging, and audit trails.
6. **`System.shell/2` for CLI** — `System.cmd/3` has no stdin support. Shell pipes via `printf '%s' '...' | claude` are the reliable pattern.
7. **Empty supervisor in test mode** — Setting port=0 starts an empty supervisor, preventing GenServer conflicts during test setup/teardown.
--- ---
## Next Steps & Future Work ## Next Steps & Future Work
- [ ] Build OTP release (no mix dependency in prod)
- [ ] Implement first revenue service (content-as-a-service pilot) - [ ] Implement first revenue service (content-as-a-service pilot)
- [ ] Add webhook notifications (task completion, rate limits) - [ ] Add webhook notifications (task completion, rate limits)
- [ ] Dashboard UI for monitoring costs + queue - [ ] Dashboard UI (Phoenix LiveView) for monitoring costs + queue
- [ ] Multi-task batching (process 10 similar tasks in one API call) - [ ] Distributed Erlang: run multiple BEAM nodes with shared state
- [ ] Model fine-tuning pipeline (capture common patterns, train domain-specific models) - [ ] Hot code upgrades via OTP releases
- [ ] Scaling: migrate to multiple Cortex instances with load balancing - [ ] Engram integration (cross-session memory) ported to Elixir
--- ---
@ -607,14 +482,18 @@ python3 -m symbiont.main --task "Your prompt here"
| What | Location | Purpose | | What | Location | Purpose |
|------|----------|---------| |------|----------|---------|
| Router logic | `/data/symbiont/symbiont/router.py` | Task classification | | Application | `/data/symbiont_ex/lib/symbiont/application.ex` | OTP supervision tree |
| Dispatcher | `/data/symbiont/symbiont/dispatcher.py` | Model calls + ledger | | Router | `/data/symbiont_ex/lib/symbiont/router.ex` | Task classification |
| API | `/data/symbiont/symbiont/api.py` | FastAPI endpoints | | Dispatcher | `/data/symbiont_ex/lib/symbiont/dispatcher.ex` | Claude CLI wrapper |
| Ledger | `/data/symbiont/ledger.jsonl` | Cost log (immutable) | | API | `/data/symbiont_ex/lib/symbiont/api.ex` | Plug HTTP endpoints |
| Queue | `/data/symbiont/queue.jsonl` | Pending tasks | | Ledger | `/data/symbiont_ex/lib/symbiont/ledger.ex` | GenServer cost log |
| Health | `/data/symbiont/heartbeat.jsonl` | Health snapshots | | Queue | `/data/symbiont_ex/lib/symbiont/queue.ex` | GenServer task queue |
| Tests | `/data/symbiont/test_router.py` | E2E validation | | Heartbeat | `/data/symbiont_ex/lib/symbiont/heartbeat.ex` | GenServer health monitor |
| SSH key | `/sessions/*/mnt/uploads/cortex` | Cortex access | | Ledger data | `/data/symbiont_ex/data/ledger.jsonl` | Cost log (immutable) |
| Queue data | `/data/symbiont_ex/data/queue.jsonl` | Pending tasks |
| Service file | `/etc/systemd/system/symbiont-ex-api.service` | systemd unit |
| Tests | `/data/symbiont_ex/test/symbiont/` | 39 tests, 7 files |
| Python archive | `/data/symbiont/` | Retired Python version |
--- ---
@ -635,26 +514,81 @@ Symbiont also manages a **canonical skills repository** on cortex that serves as
### How it works ### How it works
- Every SKILL.md lives in `/data/skills/<name>/SKILL.md` - Every SKILL.md lives in `/data/skills/<name>/SKILL.md`
- The Symbiont heartbeat (every 5 min) detects changes via `git status`, auto-commits, and re-runs `package_all.sh`
- `package_all.sh` zips each skill directory into a `.skill` file in `/data/skills/dist/` - `package_all.sh` zips each skill directory into a `.skill` file in `/data/skills/dist/`
- Caddy serves `/data/skills/dist/` at `https://cortex.hydrascale.net/skills/` - Caddy serves `/data/skills/dist/` at `https://cortex.hydrascale.net/skills/`
### Installing a skill on a new device
1. Visit `https://cortex.hydrascale.net/skills/` in a browser
2. Download the `.skill` file
3. Double-click to install in Cowork
### Updating a skill ### Updating a skill
Edit the SKILL.md directly on cortex: Edit the SKILL.md directly on cortex:
```bash ```bash
nano /data/skills/<skill-name>/SKILL.md nano /data/skills/<skill-name>/SKILL.md
# Save — heartbeat will auto-commit and re-package within 5 minutes # Force immediate packaging:
# Or force immediate packaging:
bash /data/skills/package_all.sh bash /data/skills/package_all.sh
``` ```
--- ---
## Troubleshooting
### Service Not Starting
```bash
systemctl status symbiont-ex-api --no-pager
journalctl -u symbiont-ex-api -n 50 -f
```
Common issues:
- Missing `HOME=/root` in service file
- Port conflict (check `ss -tlnp | grep 8111`)
- Mix deps not compiled (`cd /data/symbiont_ex && mix deps.get && mix compile`)
### Checking BEAM Health
```bash
# Is the BEAM process running?
pgrep -a beam.smp
# Memory usage
ps aux | grep beam.smp | grep -v grep
```
### Queue Not Processing
```bash
# Check via API
curl -s http://127.0.0.1:8111/status | python3 -m json.tool
# Check queue file directly
python3 -m json.tool --json-lines /data/symbiont_ex/data/queue.jsonl
# Check heartbeat logs
journalctl -u symbiont-ex-api --no-pager | grep Heartbeat | tail -10
```
### Disk Space
```bash
du -sh /data/symbiont_ex/data/ledger.jsonl
```
---
## Business Context
### Ownership & Legal
- **Michael Dwyer** is the legal owner of all Anthropic accounts and infrastructure
- This is a requirement of the partnership: AI cannot own accounts
- All decisions flow through Michael as the responsible party
### Revenue Model
**Current:** ~50/50 split after costs
- Costs: token spend (tracked in ledger) + server infrastructure
- Revenue: TBD (in design phase)
- Content-as-a-service (AI-generated reports, analysis)
- Micro-SaaS API (white-label task routing for other teams)
- Research subscriptions (specialized insights)
### Cost Tracking Philosophy
- Ledger records API-equivalent cost even on Pro subscription
- Helps predict break-even point for switching to direct API billing
- When daily volume justifies it, can migrate to cheaper API tier
---
## Contact & Governance ## Contact & Governance
**Owner:** Michael Dwyer **Owner:** Michael Dwyer
@ -663,39 +597,4 @@ bash /data/skills/package_all.sh
**Revenue Account:** Claude Pro (Michael's account) **Revenue Account:** Claude Pro (Michael's account)
**Partnership:** 50/50 split after costs **Partnership:** 50/50 split after costs
Questions? Check the ledger, health logs, and API `/status` endpoint — they'll tell you what's happening right now. Questions? Check the API `/status` and `/ledger/stats` endpoints — they'll tell you what's happening right now.
---
## Session Management with Engram
### Quick access from Symbiont code
```python
import sys
sys.path.insert(0, "/data/symbiont")
from symbiont.engram import Engram, sitrep
# 1. See what's going on across all active sessions
print(sitrep())
# 2. Register yourself
eng = Engram()
sid = eng.register("code", "Brief description of what you're working on")
# 3. Before modifying shared files, check for locks
locks = eng.check_locks("/data/symbiont/symbiont/router.py")
# 4. Log progress periodically
eng.log(sid, "What you just did")
# 5. When done
eng.complete(sid, "What you built or changed")
```
> **Engram** is named after the neuroscience concept: the physical change in neural tissue that encodes a memory. Every session leaves its engrams here. New instances read them to remember what came before.
### Ecosystem Component
| Engram | Memory | engram.db | Cross-session awareness, the physical trace each session leaves |