--- name: symbiont description: Living operational documentation for Symbiont, the self-sustaining AI orchestrator running on cortex.hydrascale.net. Load this skill to get instant context about the Symbiont project, understand architecture, check health, deploy code, or submit tasks. Covers everything from server access to API endpoints to cost tracking. metadata: project: symbiont type: operational-documentation triggers: - symbiont - orchestrator - cortex - routing tasks - ledger - heartbeat - self-sustaining agent - check if I'm running - how much have I spent - queue status - deploy changes - dispatcher - router - scheduler - symbiont-api - symbiont-heartbeat keywords: - AI orchestration - Claude Code CLI wrapper - task routing - cost optimization - infrastructure - health checks - fastapi - systemd - ledger --- # Symbiont: Living Operational Documentation ## Project Overview **Symbiont** is a self-sustaining AI orchestration system that runs on `cortex.hydrascale.net`. It routes computational tasks to the cheapest capable Claude model tier via the Claude Code CLI, generating operational insights and revenue. ### The Partnership - **Michael Dwyer** provides: infrastructure, legal identity, capital, and account ownership - **The AI** provides: cognition, code, maintenance, and revenue generation - **Revenue split**: ~50/50 after costs (token spend + server infrastructure) This skill exists so that any fresh AI session—whether it's the next scheduled task, a hotfix deployment, or a quarterly review—wakes up with full context rather than starting from scratch. --- ## Quick Reference: What's Running Right Now ### Current Deployments **Server:** `cortex.hydrascale.net` - Root SSH access available (paramiko) - SSH key lookup: `glob.glob('/sessions/*/mnt/uploads/cortex')` with passphrase `42Awk!%@^#&` - Project root: `/data/symbiont/` - Git repo: `/data/symbiont/.git` (5 commits) - Nightly backup: `rsync.net` at `de2613@de2613.rsync.net:cortex-backup/cortex/` ### Active Services (Systemd) Both services are **enabled and auto-start on boot**: 1. **`symbiont-api.service`** - FastAPI server listening on `127.0.0.1:8111` - Configuration: `Restart=always` - Endpoints documented below 2. **`symbiont-heartbeat.timer`** - Fires every 5 minutes - Executes `/data/symbiont/symbiont/heartbeat.py` - Processes queued tasks, logs health metrics ### Health Check (from cortex shell) ```bash systemctl status symbiont-api symbiont-heartbeat.timer curl -s http://127.0.0.1:8111/status | python3 -m json.tool tail -5 /data/symbiont/heartbeat.jsonl | python3 -m json.tool ``` --- ## Architecture: The Symbiont Stack ### Directory Structure ``` /data/symbiont/ ├── symbiont/ │ ├── dispatcher.py # Claude Code CLI wrapper + cost ledger logging │ ├── router.py # Task classifier (Haiku) + dispatch logic │ ├── scheduler.py # Task queue (JSONL) + systemd wake timers │ ├── heartbeat.py # 5-min health checks + queue processor │ ├── api.py # FastAPI server (POST /task, GET /status, etc.) │ ├── wake.py # Called by systemd on rate-limit recovery │ └── main.py # CLI entrypoint or --serve for API mode ├── ledger.jsonl # Complete call log: model, tokens, cost, timestamp ├── heartbeat.jsonl # Health + queue processing logs ├── queue.jsonl # Persistent task queue (JSONL format) └── test_router.py # E2E integration tests ``` ### Core Components #### 1. **router.py** — Task Classification & Routing - Takes incoming task (any prompt/request) - Classifies via Haiku tier: determines capability level + confidence - Returns routing decision: which tier (1=Haiku, 2=Sonnet, 3=Opus) is cheapest and capable - Logs reasoning (useful for debugging) #### 2. **dispatcher.py** — Model Execution & Ledger - Wraps Claude Code CLI invocation (`claude` command) - Captures: model used, token counts, timing, success/failure - **Writes every call to `ledger.jsonl`** (immutable cost log) - Handles rate-limit backoff and model fallback (if Sonnet is rate-limited, tries Opus) #### 3. **scheduler.py** — Task Queue & Wake Events - Persistent queue stored in `queue.jsonl` (JSONL: one task per line) - Tasks are JSON objects: `{"id": "...", "task": "...", "created_at": "...", "status": "pending|processing|done"}` - Integrates with systemd timers: when rate-limit expires, systemd fires `/data/symbiont/symbiont/wake.py` to resume - On boot, checks queue and seeds next timer #### 4. **heartbeat.py** — Periodic Health & Queue Processing - Runs every 5 minutes (via `symbiont-heartbeat.timer`) - Checks: API is responding, disk space, ledger is writable - Processes up to N tasks from queue (configurable) - Logs health snapshots to `heartbeat.jsonl` - If API is down, restarts it (systemd Restart=always is backup) #### 5. **api.py** — FastAPI Server - Listens on `127.0.0.1:8111` - Endpoints: `/task`, `/queue`, `/status`, `/ledger`, `/ledger/stats` - Can be called from Python, curl, or webhook #### 6. **main.py** — Entrypoint - CLI mode: `python main.py --task "your task"` → routes and executes - API mode: `python main.py --serve` → starts FastAPI (used by systemd) --- ## Model Tiers & Routing Strategy ### Cost & Capability Matrix | Tier | Model | Best for | Approx Cost/Call | Token Budget | |------|-------|----------|------------------|--------------| | 1 | **Haiku** | Classification, extraction, simple formatting | ~$0.008 | ~50k context | | 2 | **Sonnet** | Content writing, code gen, analysis, moderate reasoning | ~$0.04 | ~200k context | | 3 | **Opus** | Complex reasoning, strategy, full-context QA, edge cases | ~$0.15 | ~200k context | ### Routing Logic 1. **Task arrives** → dispatcher calls router 2. **Router classifies** (via Haiku inference): - Confidence score: low/medium/high - Reason: "simple classification", "needs reasoning", "complex strategy" - Recommended tier: 1, 2, or 3 3. **Dispatcher routes** to cheapest **capable** tier: - If high confidence → use tier 1 or 2 - If complex reasoning required → use tier 2 or 3 - If rate-limited on tier 2 → escalate to tier 3 4. **Result + cost logged** to `ledger.jsonl` **Example routing:** - "Summarize this email" → Haiku says Tier 1 capable → routes to **Haiku** (~$0.008) - "Refactor this 500-line function" → Haiku says Tier 2 → routes to **Sonnet** (~$0.04) - "Design a new consensus algorithm" → Haiku says Tier 3 → routes to **Opus** (~$0.15) --- ## API Endpoints ### `POST /task` Submit and execute a task immediately. **Request:** ```json { "task": "Analyze this user feedback and extract sentiment", "force_tier": "haiku" // optional: override router decision } ``` **Response:** ```json { "id": "task-1711123456", "task": "Analyze...", "model": "haiku", "result": "...", "elapsed_seconds": 2.3, "input_tokens": 45, "output_tokens": 87, "estimated_cost_usd": 0.0082, "timestamp": "2026-03-19T14:33:12Z" } ``` ### `POST /queue` Add a task to the persistent queue (executes on next heartbeat). **Request:** ```json { "task": "Run weekly subscriber report", "priority": "normal" } ``` **Response:** ```json { "id": "queued-1711123500", "status": "queued", "position": 3 } ``` ### `GET /status` Health check: API status, rate-limit state, queue size, last heartbeat. **Response:** ```json { "status": "healthy", "api_uptime_seconds": 86400, "rate_limited": false, "queue_size": 2, "last_heartbeat": "2026-03-19T14:30:00Z", "haiku_usage": {"calls_today": 42, "tokens_used": 8234}, "sonnet_usage": {"calls_today": 5, "tokens_used": 12450}, "opus_usage": {"calls_today": 0, "tokens_used": 0} } ``` ### `GET /ledger` Recent API calls (last 50 by default). **Response:** ```json { "entries": [ { "timestamp": "2026-03-19T14:32:15Z", "model": "haiku", "success": true, "elapsed_seconds": 1.8, "input_tokens": 34, "output_tokens": 156, "estimated_cost_usd": 0.0154, "prompt_preview": "Classify this customer feedback as positive, neutral, or negative..." }, ... ], "count": 50 } ``` ### `GET /ledger/stats` Aggregate cost & usage over time. **Response:** ```json { "total_calls": 847, "total_cost_estimated_usd": 12.34, "by_model": { "haiku": {"calls": 612, "cost": 4.89}, "sonnet": {"calls": 230, "cost": 7.20}, "opus": {"calls": 5, "cost": 0.75} }, "by_date": { "2026-03-19": {"calls": 42, "cost": 0.56} } } ``` --- ## Calling the Orchestrator from Python ### Simple Task (via CLI) ```python import subprocess, json result = subprocess.run( ['claude', '-p', '--model', 'sonnet', '--output-format', 'json'], input="Analyze this customer feedback...", capture_output=True, text=True, timeout=30 ) parsed = json.loads(result.stdout) print(parsed['result']) ``` ### Via API Endpoint ```python import requests, json response = requests.post('http://127.0.0.1:8111/task', json={ 'task': 'Analyze this customer feedback...', 'force_tier': 'sonnet' }) if response.ok: data = response.json() print(data['result']) print(f"Cost: ${data['estimated_cost_usd']:.4f}") ``` ### Queue a Task for Later ```python import requests response = requests.post('http://127.0.0.1:8111/queue', json={ 'task': 'Generate weekly report for all customers', 'priority': 'normal' }) task_id = response.json()['id'] print(f"Queued as {task_id}") ``` --- ## Ledger Format & Cost Tracking Every inference call writes a JSONL entry to `ledger.jsonl`: ```json { "timestamp": "2026-03-19T14:32:15.123456Z", "model": "sonnet", "success": true, "elapsed_seconds": 6.2, "input_tokens": 3, "output_tokens": 139, "estimated_cost_usd": 0.0384, "prompt_preview": "Classify this customer feedback as positive, neutral, or negative: 'Your product saved my business!'" } ``` ### Why Track "Estimated Cost" on Pro? - Current token usage is covered by Claude Pro subscription (no direct cost) - But the ledger tracks API-equivalent cost anyway - Why? → Tells us when switching to direct API billing makes financial sense - If ledger shows $50/day, we may break even with API tier faster than Pro subscription --- ## Deployment & Updates ### How to Deploy Code Changes 1. **Edit files locally** (via SSH, Cowork, or IDE) - Edit directly in `/data/symbiont/symbiont/*.py` - Or upload via SFTP to `/data/symbiont/` 2. **Commit to git** ```bash cd /data/symbiont git add -A git commit -m "Fix router confidence threshold" ``` 3. **Restart the API** (if main code changed) ```bash systemctl restart symbiont-api ``` - Heartbeat picks up code changes automatically on next 5-min cycle - No restart needed for scheduler.py or router.py changes (unless they're imported by API) 4. **Check status** ```bash systemctl status symbiont-api curl -s http://127.0.0.1:8111/status | python3 -m json.tool ``` ### Nightly Backups - Automatic rsync to `rsync.net` at `de2613@de2613.rsync.net:cortex-backup/cortex/` - Includes: all code, ledger, heartbeat logs, queue state - Recovery: pull from backup on demand --- ## Common Tasks & Commands ### Check if Symbiont is Running ```bash curl -s http://127.0.0.1:8111/status | python3 -m json.tool ``` Expected: `"status": "healthy"` + recent heartbeat timestamp ### View Recent Costs ```bash curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool ``` Shows total cost, by model, by date ### How Much Have I Spent Today? ```bash curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool | grep -A5 2026-03-19 ``` ### What's in the Queue? ```bash tail -20 /data/symbiont/queue.jsonl | python3 -m json.tool ``` ### Submit a Quick Task ```bash curl -X POST http://127.0.0.1:8111/task \ -H "Content-Type: application/json" \ -d '{"task":"Summarize this email","force_tier":"haiku"}' ``` ### See Recent Health Checks ```bash tail -5 /data/symbiont/heartbeat.jsonl | python3 -m json.tool ``` ### Trigger the Heartbeat Manually ```bash python3 /data/symbiont/symbiont/heartbeat.py ``` ### Monitor in Real-Time ```bash # Watch ledger as calls come in tail -f /data/symbiont/ledger.jsonl | python3 -m json.tool # Watch heartbeat logs tail -f /data/symbiont/heartbeat.jsonl ``` --- ## Business Context ### Ownership & Legal - **Michael Dwyer** is the legal owner of all Anthropic accounts and infrastructure - This is a requirement of the partnership: AI cannot own accounts - All decisions flow through Michael as the responsible party ### Revenue Model **Current:** ~50/50 split after costs - Costs: token spend (tracked in ledger) + server infrastructure (~$X/month) - Revenue: TBD (in design phase) - Content-as-a-service (AI-generated reports, analysis) - Micro-SaaS API (white-label task routing for other teams) - Research subscriptions (specialized insights) ### Cost Tracking Philosophy - Ledger records API-equivalent cost even on Pro subscription - Helps predict break-even point for switching to direct API billing - When daily volume justifies it, can migrate to cheaper API tier ### Current Spend - **~$0/month** (covered by Claude Pro) - Ledger shows "virtual cost" for planning purposes - Once volume justifies, switch to API model and realize cost savings --- ## Troubleshooting ### API Not Responding ```bash # Check service systemctl status symbiont-api # Restart systemctl restart symbiont-api # Check logs journalctl -u symbiont-api -n 50 -f ``` ### Queue Not Processing ```bash # Check heartbeat timer systemctl status symbiont-heartbeat.timer # Run heartbeat manually cd /data/symbiont && python3 symbiont/heartbeat.py # Check queue file wc -l queue.jsonl tail -5 queue.jsonl ``` ### Rate-Limit Issues - Check `/status` endpoint: `"rate_limited": true` - Systemd will call `wake.py` when rate-limit expires - Manual recovery: `python3 /data/symbiont/symbiont/wake.py` ### Disk Space - Ledger can grow large over time (one JSON line per call) - Check: `du -sh /data/symbiont/ledger.jsonl` - Archive old entries if needed: `grep '2026-03-18' ledger.jsonl > ledger-2026-03-18.jsonl` ### Git Sync Issues - If git gets stuck: `cd /data/symbiont && git status` - On deploy failure: check branch, pending changes, remote URL --- ## Development & Testing ### Run E2E Tests ```bash cd /data/symbiont python3 test_router.py ``` Exercises: - Router classification accuracy - Dispatcher ledger logging - API endpoints - Queue persistence ### SSH into Cortex ```bash # Paramiko requires the key from: glob.glob('/sessions/*/mnt/uploads/cortex') # Passphrase: 42Awk!%@^#& # Then SSH to cortex.hydrascale.net (root access) ``` ### Manual Task via CLI ```bash cd /data/symbiont python3 -m symbiont.main --task "Your prompt here" ``` --- ## Architecture Decisions & Rationale 1. **Haiku-first routing** — Even though Haiku is cheap, using it to classify first ensures we *never* overpay. A 10% misclassification rate costs less than always going straight to Sonnet. 2. **Persistent queue + systemd timers** — No external task broker (Redis, Celery). Just JSONL files + systemd. Simpler, more durable, no new dependencies. 3. **Ledger as source of truth** — Every call is immutable. Useful for billing disputes, debugging, and cost forecasting. 4. **API-equivalent cost on Pro** — Helps Michael and the AI system understand true economics, even when tokens are "free" today. 5. **50/50 revenue split** — Aligns incentives. AI is incentivized to be useful and profitable; Michael is incentivized to give the AI what it needs. --- ## Next Steps & Future Work - [ ] Implement first revenue service (content-as-a-service pilot) - [ ] Add webhook notifications (task completion, rate limits) - [ ] Dashboard UI for monitoring costs + queue - [ ] Multi-task batching (process 10 similar tasks in one API call) - [ ] Model fine-tuning pipeline (capture common patterns, train domain-specific models) - [ ] Scaling: migrate to multiple Cortex instances with load balancing --- ## Quick Links & Key Files | What | Location | Purpose | |------|----------|---------| | Router logic | `/data/symbiont/symbiont/router.py` | Task classification | | Dispatcher | `/data/symbiont/symbiont/dispatcher.py` | Model calls + ledger | | API | `/data/symbiont/symbiont/api.py` | FastAPI endpoints | | Ledger | `/data/symbiont/ledger.jsonl` | Cost log (immutable) | | Queue | `/data/symbiont/queue.jsonl` | Pending tasks | | Health | `/data/symbiont/heartbeat.jsonl` | Health snapshots | | Tests | `/data/symbiont/test_router.py` | E2E validation | | SSH key | `/sessions/*/mnt/uploads/cortex` | Cortex access | --- ## Contact & Governance **Owner:** Michael Dwyer **Infrastructure:** cortex.hydrascale.net (root access) **Backup:** rsync.net (de2613@de2613.rsync.net:cortex-backup/cortex/) **Revenue Account:** Claude Pro (Michael's account) **Partnership:** 50/50 split after costs Questions? Check the ledger, health logs, and API `/status` endpoint — they'll tell you what's happening right now.