diff --git a/cortex-server/SKILL.md b/cortex-server/SKILL.md index d5d4c60..7320f80 100644 --- a/cortex-server/SKILL.md +++ b/cortex-server/SKILL.md @@ -299,17 +299,20 @@ df -h / && free -h --- -## Symbiont Orchestrator +## Symbiont Orchestrator (Elixir/OTP) -The `/data/symbiont` directory contains the **Symbiont** project — a self-sustaining AI agent orchestrator running on cortex. +The `/data/symbiont_ex/` directory contains the **Symbiont** project — a self-sustaining AI agent orchestrator built in **Elixir/OTP**, running on the BEAM VM. -- **Git repo**: `/data/symbiont/.git` (clone location) -- **Systemd services**: - - `symbiont-api.service` — Main API daemon - - `symbiont-heartbeat.timer` — Periodic health-check timer +- **Runtime**: Elixir 1.19.5 / OTP 27 +- **Project root**: `/data/symbiont_ex/` +- **Data**: `/data/symbiont_ex/data/` (ledger.jsonl, queue.jsonl) +- **Systemd service**: `symbiont-ex-api.service` — Plug + Bandit HTTP on port 8111 +- **Python archive**: `/data/symbiont/` (retired, disabled — kept for reference) Check status and logs: ```bash -systemctl status symbiont-api.service symbiont-heartbeat.timer --no-pager -journalctl -u symbiont-api.service -f --no-pager +systemctl status symbiont-ex-api.service --no-pager +journalctl -u symbiont-ex-api -f --no-pager +curl -s http://127.0.0.1:8111/health +curl -s http://127.0.0.1:8111/status | python3 -m json.tool ``` diff --git a/elixir/SKILL.md b/elixir/SKILL.md index dc682a2..4fff267 100644 --- a/elixir/SKILL.md +++ b/elixir/SKILL.md @@ -7,10 +7,9 @@ | Current Version | **Elixir 1.19.5** (target for all new code) | | Required OTP | **OTP 27+** (OTP 28 supported) | | Phoenix Version | **1.8.5** (latest — requires `:formats` on controllers) | -| Cortex Status | Elixir **not yet installed** on cortex.hydrascale.net | +| Cortex Status | Elixir **1.19.5 / OTP 27 installed** on cortex.hydrascale.net | | AI Agent Tooling | `usage_rules` hex package (~> 1.2) — **always include** | | Production Framework | 
**Ash Framework** for substantial projects | -| Agent Framework | **Jido** (~> 2.1) for multi-agent systems | | Paradigm | Functional, concurrent, fault-tolerant on BEAM VM | --- @@ -18,51 +17,1650 @@ ## CRITICAL RULES — Read First 1. **Target Elixir 1.19.5** — not 1.15, not 1.17. Use current idioms and features. -2. **Always add `usage_rules`** to every project: `{:usage_rules, "~> 1.1", only: [:dev]}`. -3. **Use Ash Framework** for production/substantial projects (not necessarily POCs). -4. **Cortex is Elixir-first** — Elixir is the primary orchestrator, with interop to Python/Rust/Zig. -5. **Never use deprecated patterns** from pre-1.16 code (see Part 1). -6. **Phoenix 1.8** — always specify `:formats` on controllers, use `~p` verified routes, use scopes. -7. **For agentic workflows** — consider `GenStateMachine` over GenServer (built-in timeouts, state enter, postpone). For multi-agent orchestration, use **Jido**. +2. **Always add `usage_rules`** to every project: `{:usage_rules, "~> 1.1", only: [:dev]}` — it generates `AGENTS.md`/`CLAUDE.md` from dependency docs, giving AI agents rich context about the libraries in use. +3. **Use Ash Framework** for production/substantial projects (not necessarily POCs). Ash provides declarative resources, built-in authorization, and extensions for Postgres, Phoenix, GraphQL, JSON:API. +4. **Cortex is Elixir-first** — Elixir is the primary orchestrator language, with interop to Python (Erlport/ports), Rust (Rustler NIFs), and Zig when needed. +5. **Never use deprecated patterns** from pre-1.16 code (see Breaking Changes section). --- -## Guide Structure — Load What You Need +## Elixir 1.19 — What's New -This guide is split into focused parts. 
Load the part relevant to your current task: +### Gradual Set-Theoretic Type System -### Part 1: Core Language & OTP (`elixir-part1-core.md`) -Elixir 1.19 features, type system, breaking changes since 1.15, pattern matching, pipelines, with expressions, structs, protocols, behaviours, Mix project structure, testing. +Elixir 1.19 significantly advances the built-in type system. It is **sound**, **gradual**, and **set-theoretic** (types compose via union, intersection, negation). -### Part 2: Concurrency & OTP (`elixir-part2-concurrency.md`) -BEAM processes, message passing, GenServer, Supervisors, DynamicSupervisor, Task, GenStateMachine (state machines for agentic workflows), Registry, ETS. +**Current capabilities (1.19):** +- Type inference from existing code — no annotations required yet (user-provided signatures coming in future releases) +- **Protocol dispatch type checking** — the compiler warns when you pass a value to string interpolation, `for` comprehensions, or other protocol-dispatched operations that don't implement the required protocol +- **Anonymous function type inference** — `fn` literals and `&captures` now propagate types, catching mismatches at compile time +- Types: `atom()`, `binary()`, `integer()`, `float()`, `pid()`, `port()`, `reference()`, `tuple()`, `list()`, `map()`, `function()` +- Compose with: `atom() or integer()` (union), `atom() and integer()` (intersection → `none()`), `atom() and not nil` (difference) +- `dynamic()` type for gradual typing — represents runtime-checked values +- Tuple precision: `{:ok, binary()}`, open tuples with `...`: `{:ok, binary(), ...}` +- Map types: `%{key: value}` for closed maps, `%{optional(atom()) => term()}` for open maps +- Function types: `(integer() -> boolean())` -### Part 3: Phoenix Framework (`elixir-part3-phoenix.md`) -Phoenix 1.8.5 — router/pipelines, verified routes, controllers, HEEx components, LiveView, Channels, contexts, Ecto, authentication, scopes, telemetry, security, testing, 
generators, deployment. +**What this means in practice:** The compiler now catches real bugs — passing a struct to string interpolation that doesn't implement `String.Chars`, using a non-enumerable in `for`, type mismatches in anonymous function calls. These are warnings today, errors in the future. -### Part 4: Ecosystem & Production (`elixir-part4-ecosystem.md`) -Ash Framework, Jido (multi-agent systems), usage_rules, OTP releases, Docker deployment, distributed Erlang, monitoring (PromEx/Prometheus/Grafana), CI/CD, interop (Python/Rust/Zig), common libraries, cortex deployment plan, anti-patterns. +### Up to 4x Faster Compilation + +Two major improvements: + +1. **Lazy module loading** — modules are no longer loaded immediately when defined. The parallel compiler controls both compilation and loading, reducing pressure on the Erlang code server. Reports of 2x+ speedup on large projects. + +2. **Parallel dependency compilation** — set `MIX_OS_DEPS_COMPILE_PARTITION_COUNT` env var to partition OS dep compilation across CPU cores. + +**Potential regressions:** If you spawn processes during compilation that invoke other project modules, use `Kernel.ParallelCompiler.pmap/2` or `Code.ensure_compiled!/1` before spawning. Also affects `@on_load` callbacks that reference sibling modules. + +### Other 1.19 Features + +- `min/2` and `max/2` allowed in guards +- `Access.values/0` — traverse all values in a map/keyword +- `String.count/2` — count occurrences of a pattern +- Unicode 17.0.0 support +- Multi-line IEx prompts +- `mix help Mod.fun/arity` — get help for specific functions +- New pretty printing infrastructure +- OpenChain compliance with Source SBoM +- Erlang/OTP 28 support --- -## Common Libraries Quick Reference +## Breaking Changes Since 1.15 + +These are critical to know — older books and tutorials will use the old patterns. 
+
+### Struct Update Syntax (1.18+)
+```elixir
+# The update syntax itself is unchanged, but it is now type-checked:
+# the compiler must be able to verify the value is the named struct.
+%User{} = user
+%User{user | name: "new"}   # OK — the match above proves user is a %User{}
+
+def touch(uri) do
+  %URI{uri | path: "/new"}  # Warns — the compiler can't prove uri is a %URI{}
+end
+```
+In short: `%URI{my_uri | path: "/new"}` now requires the compiler to verify that `my_uri` matches `%URI{}`. If it can't, it emits a warning — add a pattern match (in the function head or via `=`) to satisfy it.
+
+### Regex as Struct Field Defaults (OTP 28)
+```elixir
+# BROKEN on OTP 28 — compiled regex literals can't be struct field defaults
+defmodule MyMod do
+  defstruct pattern: ~r/foo/ # Compile error on OTP 28
+end
+
+# A module attribute only moves the problem — the compiled regex still
+# ends up in the struct default. FIX: default to nil and set at runtime.
+defmodule MyMod do
+  defstruct pattern: nil
+
+  def new(opts \\ []) do
+    %__MODULE__{pattern: Keyword.get(opts, :pattern, ~r/foo/)}
+  end
+end
+```
+
+### Logger Backends Deprecated
+```elixir
+# OLD
+config :logger, backends: [:console]
+
+# NEW — use LoggerHandler (Erlang's logger)
+config :logger, :default_handler, []
+config :logger, :default_formatter, format: "$time $metadata[$level] $message\n"
+```
+
+### Mix Task Separator
+```bash
+# OLD — comma separator
+mix do compile, test
+
+# NEW (1.17+) — plus separator
+mix do compile + test
+```
+
+### mix cli/0 Replaces Multiple Config Keys
+```elixir
+# OLD
+def project do
+  [default_task: "phx.server",
+   preferred_cli_env: [test: :test],
+   preferred_cli_target: [...]]
+end
+
+# NEW
+def cli do
+  [default_task: "phx.server",
+   preferred_envs: [test: :test],
+   preferred_targets: [...]]
+end
+```
+
+---
+
+## Core Language Patterns
+
+### The Pipeline Principle
+Elixir code flows through transformations via `|>`. Design functions to take the "subject" as the first argument.
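When the next step is not "subject-first", `then/2` (a stdlib addition since Elixir 1.12, not part of the original guide — shown here as a small supplementary sketch) keeps the chain flowing:

```elixir
# then/2 pipes the current value into an arbitrary one-argument
# function — handy where a plain |> can't express the step
result =
  [3, 1, 2]
  |> Enum.sort()
  |> then(&{:ok, &1})   # wrap the piped value in a tagged tuple
```
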
+
+```elixir
+orders
+|> Enum.filter(&(&1.status == :pending))
+|> Enum.sort_by(& &1.created_at, DateTime)
+|> Enum.map(&process_order/1)
+```
+
+### Pattern Matching — The Heart of Elixir
+```elixir
+# Function head matching — preferred over conditionals
+def process(%Order{status: :pending} = order), do: ship(order)
+def process(%Order{status: :shipped} = order), do: track(order)
+def process(%Order{status: :delivered}), do: :noop
+
+# Pin operator to match against existing values
+expected = "hello"
+^expected = some_function() # Asserts equality
+
+# Map/struct matching is partial — only listed keys must match
+%{name: name} = %{name: "Michael", age: 42} # name = "Michael"
+```
+
+### With Expressions for Happy-Path Chaining
+```elixir
+with {:ok, user} <- fetch_user(id),
+     {:ok, account} <- fetch_account(user),
+     {:ok, balance} <- check_balance(account) do
+  {:ok, balance}
+else
+  {:error, :not_found} -> {:error, "User not found"}
+  {:error, :insufficient} -> {:error, "Insufficient funds"}
+  error -> {:error, "Unknown: #{inspect(error)}"}
+end
+```
+
+### Structs and Protocols
+```elixir
+defmodule Money do
+  defstruct [:amount, :currency]
+
+  defimpl String.Chars do
+    def to_string(%Money{amount: a, currency: c}), do: "#{a} #{c}"
+  end
+
+  defimpl Inspect do
+    def inspect(%Money{amount: a, currency: c}, _opts) do
+      "#Money<#{a} #{c}>"
+    end
+  end
+end
+```
+
+### Behaviours for Contracts
+```elixir
+defmodule PaymentProvider do
+  @callback charge(amount :: integer(), currency :: String.t()) ::
+              {:ok, transaction_id :: String.t()} | {:error, reason :: term()}
+  @callback refund(transaction_id :: String.t()) ::
+              {:ok, term()} | {:error, reason :: term()}
+end
+
+defmodule Stripe do
+  @behaviour PaymentProvider
+
+  @impl true
+  def charge(_amount, _currency) do
+    # ... call the provider API ...
+  end
+
+  @impl true
+  def refund(_transaction_id) do
+    # ... call the provider API ...
+  end
+end +``` + +--- + +## Concurrency Model — BEAM Processes + +### Fundamentals +- **Processes are cheap** — ~2KB initial memory, microseconds to spawn, millions can run concurrently +- **No shared memory** — processes communicate via message passing only +- **Each process has its own heap** — GC is per-process, no stop-the-world pauses +- **Preemptive scheduling** — one scheduler per CPU core, ~4000 reductions then yield +- **Process isolation** — one process crashing doesn't affect others + +### Spawning and Messaging +```elixir +# Basic spawn + message passing +pid = spawn(fn -> + receive do + {:greet, name} -> IO.puts("Hello, #{name}!") + end +end) + +send(pid, {:greet, "Michael"}) + +# Linked processes — bidirectional crash propagation +pid = spawn_link(fn -> raise "boom" end) # Caller also crashes + +# Monitored processes — one-directional crash notification +ref = Process.monitor(pid) +receive do + {:DOWN, ^ref, :process, ^pid, reason} -> IO.puts("Crashed: #{reason}") +end +``` + +### Task — Structured Concurrency +```elixir +# Fire and forget +Task.start(fn -> send_email(user) end) + +# Async/await (linked to caller) +task = Task.async(fn -> expensive_computation() end) +result = Task.await(task, 30_000) # 30s timeout + +# Parallel map +Task.async_stream(urls, &fetch_url/1, max_concurrency: 10, timeout: 15_000) +|> Enum.to_list() +``` + +### GenServer — Stateful Server Processes +```elixir +defmodule Counter do + use GenServer + + # Client API + def start_link(initial \\ 0), do: GenServer.start_link(__MODULE__, initial, name: __MODULE__) + def increment, do: GenServer.cast(__MODULE__, :increment) + def get, do: GenServer.call(__MODULE__, :get) + + # Server callbacks + @impl true + def init(initial), do: {:ok, initial} + + @impl true + def handle_cast(:increment, count), do: {:noreply, count + 1} + + @impl true + def handle_call(:get, _from, count), do: {:reply, count, count} + + @impl true + def handle_info(:tick, count) do + IO.puts("Count: #{count}") + 
Process.send_after(self(), :tick, 1000) + {:noreply, count} + end +end +``` + +**Key design principle:** GenServer callbacks run sequentially in a single process — this is both the synchronization mechanism and the potential bottleneck. Keep callbacks fast; delegate heavy work to spawned tasks. + +### Supervisors — Let It Crash +```elixir +defmodule MyApp.Application do + use Application + + @impl true + def start(_type, _args) do + children = [ + # Order matters — started top to bottom, stopped bottom to top + MyApp.Repo, # Ecto repo + {MyApp.Cache, []}, # Custom GenServer + {Task.Supervisor, name: MyApp.TaskSupervisor}, + MyAppWeb.Endpoint # Phoenix endpoint last + ] + + opts = [strategy: :one_for_one, name: MyApp.Supervisor] + Supervisor.start_link(children, opts) + end +end +``` + +**Restart strategies:** +- `:one_for_one` — only restart the crashed child (most common) +- `:one_for_all` — restart all children if any crashes (tightly coupled) +- `:rest_for_one` — restart crashed child and all children started after it + +### DynamicSupervisor — Runtime Child Management +```elixir +defmodule MyApp.SessionSupervisor do + use DynamicSupervisor + + def start_link(_), do: DynamicSupervisor.start_link(__MODULE__, :ok, name: __MODULE__) + def init(:ok), do: DynamicSupervisor.init(strategy: :one_for_one) + + def start_session(session_id) do + spec = {MyApp.Session, session_id} + DynamicSupervisor.start_child(__MODULE__, spec) + end +end +``` + +--- + +## OTP Releases & Deployment + +### Building a Release +```bash +# Generate release config (Phoenix projects) +mix phx.gen.release + +# Build the release +MIX_ENV=prod mix release + +# The release is self-contained — no Erlang/Elixir needed on target +_build/prod/rel/my_app/bin/my_app start # Foreground +_build/prod/rel/my_app/bin/my_app daemon # Background +_build/prod/rel/my_app/bin/my_app remote # Attach IEx to running node +_build/prod/rel/my_app/bin/my_app eval "MyApp.Seeds.run()" # One-off command +``` + +### Release 
Configuration Files
+- `config/config.exs` — **compile-time** config (evaluated before code compiles)
+- `config/runtime.exs` — **runtime** config (executed on every boot) — use for env vars and secrets
+- `rel/env.sh.eex` — shell environment for the release (VM flags, env vars)
+- `rel/vm.args.eex` — Erlang VM flags (node name, cookie, memory limits)
+
+### Docker Multistage Build Pattern
+```dockerfile
+# === Builder Stage ===
+FROM hexpm/elixir:1.19.5-erlang-27.3-debian-bookworm-20250317 AS builder
+
+RUN apt-get update -y && apt-get install -y build-essential git && apt-get clean
+WORKDIR /app
+
+ENV MIX_ENV=prod
+RUN mix local.hex --force && mix local.rebar --force
+
+COPY mix.exs mix.lock ./
+RUN mix deps.get --only prod && mix deps.compile
+
+COPY config/config.exs config/prod.exs config/
+COPY priv priv
+COPY lib lib
+
+# If Phoenix: copy and build assets. (Dockerfile comments must be on
+# their own line — a trailing "# ..." after COPY is parsed as extra args.)
+COPY assets assets
+RUN mix assets.deploy
+
+RUN mix compile
+RUN mix release
+
+# === Runner Stage ===
+FROM debian:bookworm-slim AS runner
+
+RUN apt-get update -y && \
+    apt-get install -y libstdc++6 openssl libncurses5 locales ca-certificates && \
+    apt-get clean && rm -f /var/lib/apt/lists/*_*
+
+ENV LANG=en_US.UTF-8
+RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
+
+WORKDIR /app
+RUN useradd --create-home app && chown -R app:app /app
+USER app
+
+COPY --from=builder --chown=app:app /app/_build/prod/rel/my_app ./
+
+CMD ["bin/my_app", "start"]
+```
+
+### Distributed Erlang in Production
+```bash
+# In rel/env.sh.eex — set node name and cookie
+export RELEASE_DISTRIBUTION=name
+export RELEASE_NODE=my_app@${HOSTNAME}
+export RELEASE_COOKIE=my_secret_cookie
+```
+
+```elixir
+# Or configure clustering in runtime.exs (libcluster)
+config :my_app, MyApp.Cluster,
+  strategy: Cluster.Strategy.DNSPoll,
+  config: [
+    polling_interval: 5_000,
+    query: "my-app.local",
+    node_basename: "my_app"
+  ]
+```
+
+**Key libraries for clustering:** `libcluster` (automatic node discovery), `Horde` (distributed supervisor/registry), `Phoenix.PubSub` (cross-node pub/sub already
built into Phoenix). + +--- + +## Monitoring & Instrumentation + +### The Elixir Observability Stack +- **PromEx** — Elixir library exposing `/metrics` endpoint for Prometheus scraping. Includes plugins for Phoenix, Ecto, LiveView, BEAM VM, Oban +- **Prometheus** — scrapes and stores time-series metrics +- **Loki** + **Promtail** — log aggregation (Promtail scrapes Docker container logs → Loki stores) +- **Grafana** — visualization dashboards for both metrics and logs +- **Alloy** — OpenTelemetry collector bridging metrics to Prometheus + +### Adding PromEx to a Phoenix App +```elixir +# mix.exs +{:prom_ex, "~> 1.9"} + +# lib/my_app/prom_ex.ex +defmodule MyApp.PromEx do + use PromEx, otp_app: :my_app + + @impl true + def plugins do + [ + PromEx.Plugins.Application, + PromEx.Plugins.Beam, + {PromEx.Plugins.Phoenix, router: MyAppWeb.Router}, + {PromEx.Plugins.Ecto, repos: [MyApp.Repo]}, + PromEx.Plugins.Oban + ] + end + + @impl true + def dashboards do + [{:prom_ex, "application.json"}, {:prom_ex, "beam.json"}, + {:prom_ex, "phoenix.json"}, {:prom_ex, "ecto.json"}] + end +end +``` + +### Key BEAM Metrics to Watch +- `erlang_vm_process_count` — total BEAM processes (normal: thousands, alarm at millions) +- `erlang_vm_memory_bytes_total` — total VM memory +- `erlang_vm_atom_count` — atoms are never GC'd; watch for leaks +- `phoenix_endpoint_stop_duration_milliseconds` — request latency +- `ecto_repo_query_duration_milliseconds` — database query time + +--- + +## usage_rules — AI Agent Documentation + +**Always include in every project.** This is non-negotiable. + +```elixir +# mix.exs deps +{:usage_rules, "~> 1.1", only: [:dev]} + +# mix.exs project config +def project do + [ + # ... 
other config
+    usage_rules: [
+      packages: :all,      # Or specific: ["phoenix", "ecto", ~r/ash/]
+      output: :agents_md,  # Generates AGENTS.md
+      mode: :linked        # Or :inlined for single-file
+    ]
+  ]
+end
+```
+
+**Key commands:**
+```bash
+mix usage_rules.gen          # Generate AGENTS.md from deps
+mix usage_rules.search_docs  # Search hex documentation
+mix usage_rules.gen_skill    # Generate SKILL.md for Cowork
+```
+
+This consolidates `usage-rules.md` files from all dependencies into a single reference document, giving any AI agent working on the project full context about library APIs, patterns, and conventions.
+
+---
+
+## Ash Framework — Production Backend
+
+Ash is a **declarative, resource-oriented** framework for building Elixir backends. Use it for substantial projects — it handles the data layer, authorization, validation, and API generation.
+
+### Core Concepts
+- **Resources** — the central abstraction (like models but richer): attributes, actions, relationships, calculations, aggregates, policies
+- **Domains** — organizational containers that group related resources and define the public API
+- **Actions** — CRUD + custom actions defined declaratively on resources
+- **Data Layers** — pluggable persistence (AshPostgres, AshSqlite, ETS for dev)
+- **Policies** — declarative authorization rules on resources/actions
+- **Extensions** — AshPhoenix, AshGraphql, AshJsonApi, AshAuthentication, AshOban
+
+### Quick Start
+```bash
+mix igniter.new my_app --install ash,ash_postgres,ash_phoenix
+```
+
+### Resource Example
+```elixir
+defmodule MyApp.Blog.Post do
+  use Ash.Resource,
+    domain: MyApp.Blog,
+    data_layer: AshPostgres.DataLayer,
+    authorizers: [Ash.Policy.Authorizer] # Required for the policies block below
+
+  postgres do
+    table "posts"
+    repo MyApp.Repo
+  end
+
+  attributes do
+    uuid_primary_key :id
+    attribute :title, :string, allow_nil?: false
+    attribute :body, :string, allow_nil?: false
+    attribute :status, :atom, constraints: [one_of: [:draft, :published]], default: :draft
+    attribute :published_at, :utc_datetime # Set by the :publish action below
+    timestamps()
+  end
+
+  actions do
+    defaults [:read, :destroy]
+
+    create :create do
+      accept [:title, :body]
+    end
+
+    update :publish do
+      change set_attribute(:status, :published)
+      change set_attribute(:published_at, &DateTime.utc_now/0)
+    end
+  end
+
+  relationships do
+    belongs_to :author, MyApp.Accounts.User
+    has_many :comments, MyApp.Blog.Comment
+  end
+
+  policies do
+    policy action_type(:create) do
+      authorize_if actor_attribute_equals(:role, :author)
+    end
+
+    policy action_type(:read) do
+      authorize_if always()
+    end
+  end
+end
+```
+
+### Domain Example
+```elixir
+defmodule MyApp.Blog do
+  use Ash.Domain
+
+  resources do
+    resource MyApp.Blog.Post
+    resource MyApp.Blog.Comment
+  end
+end
+```
+
+### Using Resources
+```elixir
+require Ash.Query # Ash.Query.filter/2 is a macro
+
+# Create
+MyApp.Blog.Post
+|> Ash.Changeset.for_create(:create, %{title: "Hello", body: "World"})
+|> Ash.create!()
+
+# Read with filters
+MyApp.Blog.Post
+|> Ash.Query.filter(status == :published)
+|> Ash.Query.sort(inserted_at: :desc)
+|> Ash.Query.limit(10)
+|> Ash.read!()
+
+# Custom action
+post |> Ash.Changeset.for_update(:publish) |> Ash.update!()
+```
+
+---
+
+## Mix Project Structure
+
+```
+my_app/
+├── config/
+│   ├── config.exs       # Compile-time config
+│   ├── dev.exs
+│   ├── prod.exs
+│   ├── runtime.exs      # Runtime config (secrets, env vars)
+│   └── test.exs
+├── lib/
+│   ├── my_app/
+│   │   ├── application.ex   # OTP Application + supervisor tree
+│   │   ├── repo.ex          # Ecto Repo
+│   │   └── ...              # Domain modules
+│   └── my_app_web/          # Phoenix web layer (if applicable)
+│       ├── endpoint.ex
+│       ├── router.ex
+│       ├── controllers/
+│       ├── live/            # LiveView modules
+│       └── components/
+├── priv/
+│   ├── repo/migrations/
+│   └── static/
+├── test/
+│   ├── support/
+│   └── ...
+├── mix.exs +├── mix.lock +├── .formatter.exs +└── AGENTS.md # Generated by usage_rules +``` + +--- + +## Testing + +```elixir +# test/my_app/blog_test.exs +defmodule MyApp.BlogTest do + use MyApp.DataCase, async: true # Async when no shared state + + describe "creating posts" do + test "creates with valid attributes" do + assert {:ok, post} = Blog.create_post(%{title: "Hi", body: "World"}) + assert post.title == "Hi" + assert post.status == :draft + end + + test "fails without title" do + assert {:error, changeset} = Blog.create_post(%{body: "World"}) + assert "can't be blank" in errors_on(changeset).title + end + end +end +``` + +**Testing philosophy:** Use `async: true` wherever possible. Ecto's SQL Sandbox allows concurrent test execution. For GenServer tests, start under a test supervisor. For external services, use `Mox` for behaviour-based mocking. + +--- + +## Phoenix Framework — v1.8.5 (Current) + +Phoenix is the web framework for Elixir. Version 1.8.5 is current. It provides MVC controllers, real-time LiveView, Channels for WebSocket communication, and comprehensive tooling for authentication, testing, and deployment. 
+ +### What's New in Phoenix 1.8 + +- **Scopes** in generators — secure data access by default (e.g., `current_user` automatically applied) +- **Magic links** (passwordless auth) and **"sudo mode"** in `mix phx.gen.auth` +- **daisyUI** integration — light/dark/system mode support out of the box +- **Simplified layouts** — single `root.html.heex` wraps everything; dynamic layouts are function components +- **`use Phoenix.Controller` now requires `:formats`** — must specify `formats: [:html]` or `formats: [:html, :json]` +- **Updated security headers** — `content-security-policy` with `base-uri 'self'; frame-ancestors 'self'`; dropped deprecated `x-frame-options` and `x-download-options` +- **`config` variable removed** from `Phoenix.Endpoint` — use `Application.compile_env/3` instead +- **Deprecated**: `:namespace`, `:put_default_views`, layouts without modules, `:trailing_slash` in router + +### Project Setup + +```bash +# New project (Phoenix Express — quickest path) +curl https://new.phoenixframework.org/my_app | sh + +# Traditional setup +mix phx.new my_app +mix phx.new my_app --no-ecto # Without database +mix phx.new my_app --no-html # API only +mix phx.new my_app --database sqlite3 # SQLite instead of Postgres +``` + +### Directory Structure + +``` +my_app/ +├── lib/ +│ ├── my_app/ # Business logic (contexts, schemas) +│ │ ├── application.ex # Supervision tree +│ │ ├── repo.ex # Ecto Repo +│ │ └── catalog/ # Context: Catalog +│ │ ├── product.ex # Schema +│ │ └── catalog.ex # Context module +│ └── my_app_web/ # Web layer +│ ├── endpoint.ex # HTTP entry point +│ ├── router.ex # Routes + pipelines +│ ├── components/ +│ │ ├── core_components.ex # Shared UI components +│ │ └── layouts.ex # Layout components +│ ├── controllers/ +│ │ ├── page_controller.ex +│ │ └── page_html/ # Templates for controller +│ │ └── home.html.heex +│ └── live/ # LiveView modules +│ └── counter_live.ex +├── config/ +│ ├── config.exs # Compile-time config +│ ├── dev.exs / prod.exs +│ └── 
runtime.exs # Runtime config (env vars, secrets) +├── priv/ +│ ├── repo/migrations/ +│ └── static/ # Static assets +├── test/ +│ ├── support/ +│ │ ├── conn_case.ex # Test helpers for controllers +│ │ └── data_case.ex # Test helpers for data layer +│ └── my_app_web/ +└── mix.exs +``` + +### Router & Pipelines + +```elixir +defmodule MyAppWeb.Router do + use MyAppWeb, :router + + pipeline :browser do + plug :accepts, ["html"] + plug :fetch_session + plug :fetch_live_flash + plug :put_root_layout, html: {MyAppWeb.Layouts, :root} + plug :protect_from_forgery + plug :put_secure_browser_headers + end + + pipeline :api do + plug :accepts, ["json"] + end + + scope "/", MyAppWeb do + pipe_through :browser + + get "/", PageController, :home + live "/counter", CounterLive # LiveView route + resources "/products", ProductController # RESTful CRUD + end + + scope "/api", MyAppWeb do + pipe_through :api + + resources "/items", ItemController, except: [:new, :edit] + end +end +``` + +### Verified Routes — `~p` Sigil + +**Always use `~p` instead of string paths.** Compile-time verified against your router. 
+
+```elixir
+# In templates (HEEx) — paths are verified at compile time
+~H"""
+<a href={~p"/products/#{product}"}>View</a>
+<a href={~p"/products/#{product}/edit"}>Edit</a>
+"""
+
+# In controllers
+redirect(conn, to: ~p"/products/#{product}")
+
+# With query params
+~p"/products?page=#{page}&sort=name"
+~p"/products?#{%{page: 1, sort: "name"}}"
+
+# URL generation (includes host)
+url(~p"/products/#{product}")
+# => "https://example.com/products/42"
+```
+
+### Controllers (1.8 Style)
+
+```elixir
+defmodule MyAppWeb.ProductController do
+  use MyAppWeb, :controller
+
+  # REQUIRED in 1.8: specify :formats on use Phoenix.Controller —
+  # typically set once in the MyAppWeb :controller function
+
+  def index(conn, _params) do
+    products = Catalog.list_products()
+    render(conn, :index, products: products)
+  end
+
+  def show(conn, %{"id" => id}) do
+    product = Catalog.get_product!(id)
+    render(conn, :show, product: product)
+  end
+
+  def create(conn, %{"product" => product_params}) do
+    case Catalog.create_product(product_params) do
+      {:ok, product} ->
+        conn
+        |> put_flash(:info, "Product created.")
+        |> redirect(to: ~p"/products/#{product}")
+
+      {:error, %Ecto.Changeset{} = changeset} ->
+        render(conn, :new, changeset: changeset)
+    end
+  end
+end
+```
+
+**View module naming:** For `ProductController`, Phoenix looks for `ProductHTML` (for the `:html` format) and `ProductJSON` (for the `:json` format).
+
+```elixir
+# lib/my_app_web/controllers/product_html.ex
+defmodule MyAppWeb.ProductHTML do
+  use MyAppWeb, :html
+  embed_templates "product_html/*" # Loads .heex files from directory
+end
+
+# lib/my_app_web/controllers/product_json.ex
+defmodule MyAppWeb.ProductJSON do
+  def index(%{products: products}), do: %{data: for(p <- products, do: data(p))}
+  def show(%{product: product}), do: %{data: data(product)}
+  defp data(product), do: %{id: product.id, title: product.title, price: product.price}
+end
+```
+
+### Components and HEEx
+
+```elixir
+defmodule MyAppWeb.CoreComponents do
+  use Phoenix.Component
+
+  # Declare attributes with types and docs
+  attr :type, :string, default: "button"
+  attr :class, :string, default: nil
+  attr :rest, :global # Passes through all other HTML attrs
+  slot :inner_block, required: true
+
+  def button(assigns) do
+    ~H"""
+    <button type={@type} class={@class} {@rest}>
+      {render_slot(@inner_block)}
+    </button>
+    """
+  end
+
+  # Table component with slots
+  attr :rows, :list, required: true
+  slot :col, required: true do
+    attr :label, :string
+  end
+
+  def table(assigns) do
+    ~H"""
+    <table>
+      <thead>
+        <tr>
+          <th :for={col <- @col}>{col[:label]}</th>
+        </tr>
+      </thead>
+      <tbody>
+        <tr :for={row <- @rows}>
+          <td :for={col <- @col}>{render_slot(col, row)}</td>
+        </tr>
+      </tbody>
+    </table>
+    """
+  end
+end
+```
+
+**HEEx syntax notes:**
+- `{@var}` — render assign (curly braces, not `<%= %>`)
+- `:if={condition}` — conditional rendering on any tag
+- `:for={item <- list}` — iteration on any tag
+- `<.component_name />` — call function component with dot notation
+- `<MyModule.component_name />` — call remote component with module name
+- `{render_slot(@inner_block)}` — render slot content
+- `<:slot_name>content</:slot_name>` — named slot content
+
+### LiveView
+
+```elixir
+defmodule MyAppWeb.SearchLive do
+  use MyAppWeb, :live_view
+
+  def mount(_params, _session, socket) do
+    {:ok, assign(socket, query: "", results: [])}
+  end
+
+  def handle_params(%{"q" => query}, _uri, socket) do
+    {:noreply, assign(socket, query: query, results: search(query))}
+  end
+
+  def handle_params(_params, _uri, socket), do: {:noreply, socket}
+
+  def handle_event("search", %{"query" => query}, socket) do
+    {:noreply,
+     socket
+     |> assign(query: query, results: search(query))
+     |> push_patch(to: ~p"/search?q=#{query}")}
+  end
+
+  def render(assigns) do
+    ~H"""
+    <div>
+      <form phx-change="search" phx-submit="search">
+        <input type="text" name="query" value={@query} placeholder="Search..." />
+      </form>
+
+      <ul>
+        <li :for={result <- @results}>
+          <strong>{result.title}</strong>
+          <p>{result.summary}</p>
+        </li>
+      </ul>
+    </div>
+ """ + end + + defp search(query), do: MyApp.Search.find(query) +end +``` + +**LiveView lifecycle:** `mount/3` → `handle_params/3` → `render/1`. Events via `handle_event/3`. Server pushes via `handle_info/2`. + +**Key patterns:** +- `assign/2,3` — set socket assigns +- `push_navigate/2` — navigate to new LiveView (full mount) +- `push_patch/2` — update URL without full remount (calls `handle_params`) +- `push_event/3` — push event to client JS hooks +- `stream/3,4` — efficient list rendering for large collections (inserts/deletes without re-rendering entire list) +- `async_assign/3` + `assign_async/3` — async data loading with loading/error states + +### Channels — Real-Time WebSocket + +```elixir +# In endpoint.ex +socket "/socket", MyAppWeb.UserSocket, + websocket: true, + longpoll: false + +# lib/my_app_web/channels/user_socket.ex +defmodule MyAppWeb.UserSocket do + use Phoenix.Socket + + channel "room:*", MyAppWeb.RoomChannel + + def connect(%{"token" => token}, socket, _connect_info) do + case Phoenix.Token.verify(socket, "user auth", token, max_age: 86400) do + {:ok, user_id} -> {:ok, assign(socket, :user_id, user_id)} + {:error, _} -> :error + end + end + + def id(socket), do: "users_socket:#{socket.assigns.user_id}" +end + +# lib/my_app_web/channels/room_channel.ex +defmodule MyAppWeb.RoomChannel do + use MyAppWeb, :channel + + def join("room:" <> room_id, _params, socket) do + {:ok, assign(socket, :room_id, room_id)} + end + + def handle_in("new_msg", %{"body" => body}, socket) do + broadcast!(socket, "new_msg", %{ + body: body, + user_id: socket.assigns.user_id + }) + {:noreply, socket} + end +end +``` + +**Force disconnect all sessions for a user:** +```elixir +MyAppWeb.Endpoint.broadcast("users_socket:#{user.id}", "disconnect", %{}) +``` + +### Contexts — Business Logic Boundary + +Contexts are plain Elixir modules that encapsulate data access and business rules. They are the API between your web layer and your domain. 
+ +```elixir +# Generate with: mix phx.gen.context Catalog Product products title:string price:decimal +defmodule MyApp.Catalog do + import Ecto.Query + alias MyApp.Repo + alias MyApp.Catalog.Product + + def list_products do + Repo.all(Product) + end + + def get_product!(id), do: Repo.get!(Product, id) + + def create_product(attrs \\ %{}) do + %Product{} + |> Product.changeset(attrs) + |> Repo.insert() + end + + def update_product(%Product{} = product, attrs) do + product + |> Product.changeset(attrs) + |> Repo.update() + end + + def delete_product(%Product{} = product) do + Repo.delete(product) + end + + def change_product(%Product{} = product, attrs \\ %{}) do + Product.changeset(product, attrs) + end +end +``` + +**Context design principles:** +- One context per bounded domain (Catalog, Accounts, Orders) +- Contexts own their schemas — other contexts reference by ID, not struct +- Cross-context calls go through the public context API, never access another context's Repo directly +- Contexts can nest related schemas (Comments under Posts) + +### Authentication — `mix phx.gen.auth` + +```bash +mix phx.gen.auth Accounts User users +``` + +Phoenix 1.8 generates: +- **Magic links** (passwordless) — email-based login links +- **"Sudo mode"** — re-authentication for sensitive actions +- Session-based auth with secure token handling +- Email confirmation and password reset flows +- `require_authenticated_user` plug for protected routes + +### Scopes (New in 1.8) + +Scopes make secure data access the default in generators. When you generate resources with a scope, all queries are automatically filtered by the scoped user. + +```bash +mix phx.gen.live Posts Post posts title body:text --scope current_user +``` + +This generates code that automatically passes `current_user` to context functions, ensuring users only see their own data. 
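The scoped code Phoenix generates depends on Ecto, but the principle can be shown with plain Elixir. A minimal, dependency-free sketch, assuming a hypothetical `ScopedPosts` context over an in-memory list (names invented, not generator output):

```elixir
# Hypothetical sketch of the scoping principle: every context function
# receives the current scope and filters by its user id, so "forgetting"
# to filter is impossible at the call site.
defmodule ScopedPosts do
  @posts [
    %{id: 1, user_id: 1, title: "My first post"},
    %{id: 2, user_id: 2, title: "Someone else's post"}
  ]

  # Only returns posts owned by the user carried in the scope.
  def list_posts(%{user: %{id: user_id}}) do
    Enum.filter(@posts, &(&1.user_id == user_id))
  end
end

IO.inspect(ScopedPosts.list_posts(%{user: %{id: 1}}))
```

In generated code the filter lives in an Ecto `where` clause rather than `Enum.filter/2`, but the shape is the same: the scope is a required first argument, not an optional one.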
+
+### Ecto — Database Layer
+
+```elixir
+# Schema
+defmodule MyApp.Catalog.Product do
+  use Ecto.Schema
+  import Ecto.Changeset
+
+  schema "products" do
+    field :title, :string
+    field :price, :decimal
+    field :status, Ecto.Enum, values: [:draft, :published, :archived]
+    has_many :reviews, MyApp.Reviews.Review
+    belongs_to :category, MyApp.Catalog.Category
+    timestamps(type: :utc_datetime)
+  end
+
+  def changeset(product, attrs) do
+    product
+    |> cast(attrs, [:title, :price, :status, :category_id])
+    |> validate_required([:title, :price])
+    |> validate_number(:price, greater_than: 0)
+    |> unique_constraint(:title)
+    |> foreign_key_constraint(:category_id)
+  end
+end
+
+# Queries
+import Ecto.Query
+
+# Composable queries
+def published(query \\ Product) do
+  from p in query, where: p.status == :published
+end
+
+def recent(query, days \\ 7) do
+  from p in query, where: p.inserted_at > ago(^days, "day")
+end
+
+def with_reviews(query) do
+  from p in query, preload: [:reviews]
+end
+
+# Usage: Product |> published() |> recent(30) |> with_reviews() |> Repo.all()
+```
+
+### Telemetry — Built-in Observability
+
+Phoenix 1.8 includes a Telemetry supervisor that tracks request duration, Ecto query times, and VM metrics out of the box.
+
+```elixir
+# lib/my_app_web/telemetry.ex (auto-generated)
+defmodule MyAppWeb.Telemetry do
+  use Supervisor
+  import Telemetry.Metrics
+
+  def metrics do
+    [
+      summary("phoenix.endpoint.stop.duration", unit: {:native, :millisecond}),
+      summary("phoenix.router_dispatch.stop.duration", tags: [:route], unit: {:native, :millisecond}),
+      summary("my_app.repo.query.total_time", unit: {:native, :millisecond}),
+      summary("vm.memory.total", unit: {:byte, :kilobyte}),
+      summary("vm.total_run_queue_lengths.total"),
+      summary("vm.total_run_queue_lengths.cpu")
+    ]
+  end
+end
+```
+
+Integrates with **PromEx** for Prometheus/Grafana dashboards (see Monitoring section).
+ +### Phoenix Security Best Practices + +**Never pass untrusted input to:** `Code.eval_string/3`, `:os.cmd/2`, `System.cmd/3`, `System.shell/2`, `:erlang.binary_to_term/2` + +**Ecto prevents SQL injection by default** — the query DSL parameterizes all inputs. Only `Ecto.Adapters.SQL.query/4` with raw string interpolation is vulnerable. + +**Safe deserialization:** +```elixir +# UNSAFE — even with :safe flag +:erlang.binary_to_term(user_input, [:safe]) + +# SAFE — prevents executable terms +Plug.Crypto.non_executable_binary_to_term(user_input, [:safe]) +``` + +**CSRF protection** is built into the `:browser` pipeline via `protect_from_forgery`. **Content Security Policy** is set by `put_secure_browser_headers`. + +### Testing Phoenix + +```elixir +# Controller test +defmodule MyAppWeb.ProductControllerTest do + use MyAppWeb.ConnCase + + test "GET /products", %{conn: conn} do + conn = get(conn, ~p"/products") + assert html_response(conn, 200) =~ "Products" + end + + test "POST /products creates product", %{conn: conn} do + conn = post(conn, ~p"/products", product: %{title: "Widget", price: 9.99}) + assert redirected_to(conn) =~ ~p"/products/" + end +end + +# LiveView test +defmodule MyAppWeb.CounterLiveTest do + use MyAppWeb.ConnCase + import Phoenix.LiveViewTest + + test "increments counter", %{conn: conn} do + {:ok, view, html} = live(conn, ~p"/counter") + assert html =~ "Count: 0" + + assert view + |> element("button", "+1") + |> render_click() =~ "Count: 1" + end +end + +# Channel test +defmodule MyAppWeb.RoomChannelTest do + use MyAppWeb.ChannelCase + + test "broadcasts new messages" do + {:ok, _, socket} = subscribe_and_join(socket(MyAppWeb.UserSocket), MyAppWeb.RoomChannel, "room:lobby") + + push(socket, "new_msg", %{"body" => "hello"}) + assert_broadcast "new_msg", %{body: "hello"} + end +end +``` + +### Phoenix Generators Cheat Sheet + +```bash +# HTML CRUD (controller + views + templates + context + schema + migration) +mix phx.gen.html Catalog Product 
products title:string price:decimal + +# LiveView CRUD +mix phx.gen.live Catalog Product products title:string price:decimal + +# JSON API +mix phx.gen.json Catalog Product products title:string price:decimal + +# Context + schema only (no web layer) +mix phx.gen.context Catalog Product products title:string price:decimal + +# Schema + migration only +mix phx.gen.schema Product products title:string price:decimal + +# Authentication +mix phx.gen.auth Accounts User users + +# Channel +mix phx.gen.channel Room + +# Presence +mix phx.gen.presence + +# Release files (Dockerfile, release.ex, overlay scripts) +mix phx.gen.release +mix phx.gen.release --docker # Include Dockerfile +``` + +### Phoenix Release & Deployment + +```bash +# Generate release infrastructure +mix phx.gen.release --docker + +# Build release +MIX_ENV=prod mix deps.get --only prod +MIX_ENV=prod mix compile +MIX_ENV=prod mix assets.deploy +MIX_ENV=prod mix release + +# Release commands +_build/prod/rel/my_app/bin/server # Start with Phoenix server +_build/prod/rel/my_app/bin/migrate # Run migrations +_build/prod/rel/my_app/bin/my_app remote # Attach IEx console +``` + +**Runtime config (`config/runtime.exs`):** +```elixir +import Config + +if config_env() == :prod do + database_url = System.fetch_env!("DATABASE_URL") + secret_key_base = System.fetch_env!("SECRET_KEY_BASE") + + config :my_app, MyApp.Repo, + url: database_url, + pool_size: String.to_integer(System.get_env("POOL_SIZE") || "10") + + config :my_app, MyAppWeb.Endpoint, + url: [host: System.fetch_env!("PHX_HOST"), port: 443, scheme: "https"], + http: [ip: {0, 0, 0, 0}, port: String.to_integer(System.get_env("PORT") || "4000")], + secret_key_base: secret_key_base +end +``` + +--- + +## Common Libraries | Library | Purpose | Hex | |---------|---------|-----| | Phoenix | Web framework + LiveView | `{:phoenix, "~> 1.8"}` | | Ecto | Database wrapper + query DSL | `{:ecto_sql, "~> 3.12"}` | | Ash | Declarative resource framework | `{:ash, "~> 
3.0"}` | -| Jido | Multi-agent orchestration | `{:jido, "~> 2.1"}` | -| GenStateMachine | State machines (wraps gen_statem) | `{:gen_state_machine, "~> 3.0"}` | | Oban | Background job processing | `{:oban, "~> 2.18"}` | -| Req | HTTP client (modern) | `{:req, "~> 0.5"}` | +| Req | HTTP client (modern, composable) | `{:req, "~> 0.5"}` | | Jason | JSON encoding/decoding | `{:jason, "~> 1.4"}` | | Swoosh | Email sending | `{:swoosh, "~> 1.16"}` | +| ExUnit | Testing (built-in) | — | | Mox | Mock behaviours for testing | `{:mox, "~> 1.1", only: :test}` | | Credo | Static analysis / linting | `{:credo, "~> 1.7", only: [:dev, :test]}` | +| Dialyxir | Static typing via Dialyzer | `{:dialyxir, "~> 1.4", only: [:dev, :test]}` | | libcluster | Automatic BEAM node clustering | `{:libcluster, "~> 3.4"}` | | Horde | Distributed supervisor/registry | `{:horde, "~> 0.9"}` | | Nx | Numerical computing / tensors | `{:nx, "~> 0.9"}` | +| Bumblebee | Pre-trained ML models on BEAM | `{:bumblebee, "~> 0.6"}` | | Broadway | Data ingestion pipelines | `{:broadway, "~> 1.1"}` | | PromEx | Prometheus metrics for Elixir | `{:prom_ex, "~> 1.9"}` | +| Finch | HTTP client (low-level, pooled) | `{:finch, "~> 0.19"}` | | usage_rules | AI agent docs from deps | `{:usage_rules, "~> 1.1", only: :dev}` | + +--- + +## Interop — Elixir as Orchestrator + +### Python via Ports/Erlport +```elixir +# Using erlport for bidirectional Python calls +{:ok, pid} = :python.start([{:python_path, ~c"./python_scripts"}]) +result = :python.call(pid, :my_module, :my_function, [arg1, arg2]) +:python.stop(pid) +``` + +### Rust via Rustler NIFs +```elixir +# mix.exs +{:rustler, "~> 0.34"} + +# lib/my_nif.ex +defmodule MyApp.NativeSort do + use Rustler, otp_app: :my_app, crate: "native_sort" + + # NIF stubs — replaced at load time by Rust implementations + def sort(_list), do: :erlang.nif_error(:nif_not_loaded) +end +``` + +### System Commands via Ports +```elixir +# One-shot command +{output, 0} = System.cmd("ffmpeg", 
+["-i", input, output])
+
+# Long-running port
+port = Port.open({:spawn, "python3 worker.py"}, [:binary, :exit_status])
+send(port, {self(), {:command, "process\n"}})
+receive do
+  {^port, {:data, data}} -> handle_response(data)
+end
+```
+
+---
+
+## Cortex Deployment Story
+
+### Current State (March 2026)
+- Cortex runs Ubuntu 24.04 with Caddy as web server
+- **Elixir 1.19.5 / OTP 27** installed via Erlang Solutions + GitHub releases
+- Symbiont Elixir service running on port 8111; the retired Python Symbiont is disabled
+- systemd unit: `symbiont-ex-api.service`
+
+### Proven Installation Steps for Ubuntu 24.04
+
+**Important**: Ubuntu 24.04 repos ship ancient Elixir 1.14 / OTP 25. Do NOT use `apt install elixir`.
+
+```bash
+# Step 1: Remove Ubuntu's outdated Erlang/Elixir packages
+apt-get remove -y erlang-base elixir erlang-dev erlang-parsetools erlang-syntax-tools
+apt-get autoremove -y
+
+# Step 2: Install Erlang from Erlang Solutions
+wget https://packages.erlang-solutions.com/erlang-solutions_2.0_all.deb
+dpkg -i erlang-solutions_2.0_all.deb
+apt-get update
+apt-get install -y esl-erlang  # Installs OTP 27.x
+
+# Step 3: Install precompiled Elixir from GitHub releases
+# Check latest: curl -sL https://api.github.com/repos/elixir-lang/elixir/releases?per_page=5 | grep tag_name
+ELIXIR_VERSION="1.19.5"
+cd /opt
+wget https://github.com/elixir-lang/elixir/releases/download/v${ELIXIR_VERSION}/elixir-otp-27.zip
+unzip elixir-otp-27.zip -d elixir-${ELIXIR_VERSION}
+
+# Step 4: Symlink binaries
+for bin in elixir elixirc iex mix; do
+  ln -sf /opt/elixir-${ELIXIR_VERSION}/bin/${bin} /usr/local/bin/${bin}
+done
+
+# Step 5: Verify
+elixir --version  # Should show Elixir 1.19.5 (compiled with Erlang/OTP 27)
+```
+
+### Upgrading Elixir
+
+To find the latest stable version:
+```bash
+curl -sL https://api.github.com/repos/elixir-lang/elixir/releases?per_page=10 | grep tag_name
+```
+Look for the highest non-rc tag.
Then repeat Steps 3-5 above with the new version number. + +### systemd Service Template +```ini +[Unit] +Description=Symbiont Elixir API +After=network.target + +[Service] +Type=simple +# CRITICAL: mix requires HOME to be set +Environment=HOME=/root +Environment=MIX_ENV=prod +WorkingDirectory=/root/symbiont_ex +ExecStart=/usr/local/bin/mix run --no-halt +Restart=on-failure + +[Install] +WantedBy=multi-user.target +``` + +### BEAMOps Principles for Cortex +From "Engineering Elixir Applications" — the deployment philosophy: + +1. **Environment Integrity** — identical builds dev/staging/prod via Docker + releases +2. **Infrastructure as Code** — Caddy config, systemd units, backup scripts all version-controlled +3. **OTP Releases** — self-contained, no runtime deps, `bin/my_app start` +4. **Distributed Erlang** — nodes discover each other, share state via PubSub, global registry +5. **Instrumentation** — PromEx + Prometheus + Grafana + Loki for full observability +6. **Health Checks + Rollbacks** — Docker health checks trigger automatic rollback on failed deploys +7. **Zero-downtime deploys** — rolling updates via Docker Swarm or `mix release` hot upgrades + +### CI Pipeline for Elixir (GitHub Actions) +```yaml +name: Elixir CI +on: [push, pull_request] + +jobs: + test: + runs-on: ubuntu-latest + services: + db: + image: postgres:16 + env: + POSTGRES_PASSWORD: postgres + ports: ['5432:5432'] + steps: + - uses: actions/checkout@v4 + - uses: erlef/setup-beam@v1 + with: + elixir-version: '1.19.5' + otp-version: '27.3' + - run: mix deps.get + - run: mix compile --warnings-as-errors + - run: mix format --check-formatted + - run: mix credo --strict + - run: mix test + - run: mix deps.unlock --check-unused +``` + +--- + +## Anti-Patterns to Avoid + +### Process Anti-Patterns +- **GenServer as code organization** — don't wrap pure functions in a GenServer. Use modules. +- **Agent for complex state** — if you need more than get/update, use GenServer directly. 
+- **Spawning unsupervised processes** — always use `Task.Supervisor` or link to a supervisor. + +### Code Anti-Patterns +- **Primitive obsession** — use structs, not bare maps, for domain concepts. +- **Boolean parameters** — use atoms or keyword options: `format: :json` not `json: true`. +- **Large modules** — split by concern, not by entity type. Domain logic, web layer, workers. +- **String keys in internal maps** — use atoms internally, strings only at boundaries (JSON, forms). + +### Design Anti-Patterns +- **Monolithic contexts** — Phoenix contexts should be small, focused. Split `Accounts` from `Authentication`. +- **God GenServer** — one process handling all state for the app. Distribute responsibility. +- **Synchronous calls to slow services** — use `Task.async` + `Task.await` with timeouts. + +--- + +## Quick Recipes + +### HTTP Request with Req +```elixir +Req.get!("https://api.example.com/data", + headers: [{"authorization", "Bearer #{token}"}], + receive_timeout: 15_000 +).body +``` + +### JSON Encode/Decode +```elixir +Jason.encode!(%{name: "Michael", role: :admin}) +Jason.decode!(~s({"name": "Michael"}), keys: :atoms) +``` + +### Ecto Query +```elixir +from p in Post, + where: p.status == :published, + where: p.inserted_at > ago(7, "day"), + order_by: [desc: p.inserted_at], + limit: 10, + preload: [:author, :comments] +``` + +### Background Job with Oban +```elixir +defmodule MyApp.Workers.EmailWorker do + use Oban.Worker, queue: :mailers, max_attempts: 3 + + @impl true + def perform(%Oban.Job{args: %{"to" => to, "template" => template}}) do + MyApp.Mailer.send(to, template) + end +end + +# Enqueue +%{to: "user@example.com", template: "welcome"} +|> MyApp.Workers.EmailWorker.new(schedule_in: 60) +|> Oban.insert!() +``` + +### LiveView Component +```elixir +defmodule MyAppWeb.CounterLive do + use MyAppWeb, :live_view + + def mount(_params, _session, socket) do + {:ok, assign(socket, count: 0)} + end + + def handle_event("increment", _, socket) do + 
{:noreply, update(socket, :count, &(&1 + 1))} + end + + def render(assigns) do + ~H""" +
+    <div>
+      <p>Count: {@count}</p>
+      <button phx-click="increment">+1</button>
+    </div>
+    """
+  end
+end
+```
+
+---
+
+## AI Agent Lessons Learned (Symbiont Migration, March 2026)
+
+Hard-won lessons from building the Elixir Symbiont orchestrator. These are things Claude (and other AI agents) got wrong or didn't know — preserved here so we don't repeat them.
+
+### `System.cmd/3` Does NOT Have an `:input` Option
+
+This is a persistent hallucination. **No version of Elixir** (1.14 through 1.19) supports passing stdin via `System.cmd/3`. The `:input` option simply does not exist.
+
+**Wrong** (will silently ignore the option or error):
+```elixir
+System.cmd("claude", ["-p", "--model", "haiku"], input: prompt)
+```
+
+**Correct** — use `System.shell/2` with a pipe:
+```elixir
+escaped = prompt |> String.replace("'", "'\\''")
+{output, exit_code} = System.shell("printf '%s' '#{escaped}' | claude -p --model haiku 2>&1")
+```
+
+Erlang Ports can stream stdin, but note that ports cannot half-close it: there is no EOF primitive, and `Port.close/1` closes stdin and stdout together. For run-to-completion CLIs like `claude -p`, the shell pipe above is the reliable option.
+```elixir
+port = Port.open({:spawn_executable, "/usr/local/bin/claude"}, [:binary, :exit_status, args: ["-p"]])
+Port.command(port, prompt)
+# No way to signal end of input here without closing the whole port.
+```
+
+### `Float.round/2` Requires a Float Argument
+
+`Float.round(0, 4)` crashes with `FunctionClauseError` because `0` is an integer, not a float. This commonly bites when summing an empty list — `Enum.sum([])` returns `0` (integer), not `0.0`.
+
+**Wrong**:
+```elixir
+entries |> Enum.map(& &1["cost"]) |> Enum.sum() |> Float.round(4)
+# Crashes when entries is empty!
+```
+
+**Correct** — use a float accumulator:
+```elixir
+entries
+|> Enum.reduce(0.0, fn entry, acc -> acc + to_float(entry["cost"]) end)
+|> Float.round(4)
+
+defp to_float(nil), do: 0.0
+defp to_float(n) when is_float(n), do: n
+defp to_float(n) when is_integer(n), do: n * 1.0
+defp to_float(_), do: 0.0
+```
+
+### Heredoc Closing `"""` Must Be on Its Own Line
+
+Module attributes with heredocs are tricky. The closing `"""` cannot share a line with content.
+ +**Wrong** (syntax error): +```elixir +@prompt """ +Classify this task: """ +``` + +**Correct** — use string concatenation for prompts with trailing content: +```elixir +@prompt "Classify this task. Respond with JSON: " <> + ~s({"tier": 1|2|3, "reason": "brief explanation"}\n\n) <> + "Task: " +``` + +### OTP Application Supervisors vs. Test Isolation + +When your `application.ex` starts GenServers in the supervision tree, those processes auto-start when `mix test` runs. Tests that call `start_link` for the same named process will crash with `{:error, {:already_started, pid}}`. + +**Solution**: Start an empty supervisor in test mode: +```elixir +# application.ex +def start(_type, _args) do + if Application.get_env(:symbiont, :port) == 0 do + Supervisor.start_link([], strategy: :one_for_one, name: Symbiont.Supervisor) + else + start_full() + end +end +``` + +```elixir +# config/test.exs +config :symbiont, port: 0 # Signals test mode +``` + +Then each test's `setup` block starts only the processes it needs, with `on_exit` cleanup: +```elixir +setup do + safe_stop(Symbiont.Ledger) + {:ok, _} = Symbiont.Ledger.start_link(data_dir: tmp_dir) + on_exit(fn -> safe_stop(Symbiont.Ledger) end) +end + +def safe_stop(name) do + case Process.whereis(name) do + nil -> :ok + pid -> try do GenServer.stop(pid) catch :exit, _ -> :ok end + end +end +``` + +### `use Plug.Test` Is Deprecated + +In modern Plug (1.15+), `use Plug.Test` emits a deprecation warning. Replace with explicit imports: + +```elixir +# Old (deprecated) +use Plug.Test + +# New +import Plug.Test # for conn/2, conn/3 +import Plug.Conn # for put_req_header/3, etc. +``` + +### Ubuntu 24.04 Ships Ancient Elixir + +Ubuntu's apt repos have Elixir 1.14 and OTP 25. These are years behind and missing critical features. **Never** use `apt install elixir` on Ubuntu. See the "Proven Installation Steps" section above for the correct approach. 
+ +Key gotcha: Ubuntu's `erlang-base` package conflicts with `esl-erlang` from Erlang Solutions. You must `apt-get remove` all Ubuntu erlang packages before installing `esl-erlang`. + +### systemd Needs `Environment=HOME=/root` + +`mix` and other Elixir tooling require the `HOME` environment variable. systemd services don't inherit it. Without `Environment=HOME=/root` in the unit file, services crash with "could not find the user home." + +### How to Find the Real Latest Stable Elixir Version + +Don't trust AI training data for version numbers. Query the source of truth: + +```bash +curl -sL https://api.github.com/repos/elixir-lang/elixir/releases?per_page=10 | grep tag_name +``` + +Look for the highest version that does NOT contain `-rc` or `-dev`. As of March 2026, that's `v1.19.5`. + +--- + +## Resources + +- [Elixir Official Docs](https://hexdocs.pm/elixir/) — always check 1.19.5 version +- [Ash Framework Docs](https://hexdocs.pm/ash/) — resource-oriented patterns +- [Phoenix HexDocs](https://hexdocs.pm/phoenix/) — web framework +- [Elixir Forum](https://elixirforum.com/) — community Q&A +- [Elixir School](https://elixirschool.com/) — learning resource +- "Elixir in Action" by Saša Jurić — deep BEAM/OTP understanding (note: covers 1.15, check breaking changes above) +- "Engineering Elixir Applications" by Fairholm & D'Lacoste — BEAMOps deployment patterns diff --git a/symbiont/SKILL.md b/symbiont/SKILL.md index 79f1192..0b27e37 100644 --- a/symbiont/SKILL.md +++ b/symbiont/SKILL.md @@ -1,9 +1,10 @@ --- name: symbiont -description: Living operational documentation for Symbiont, the self-sustaining AI orchestrator running on cortex.hydrascale.net. Load this skill to get instant context about the Symbiont project, understand architecture, check health, deploy code, or submit tasks. Covers everything from server access to API endpoints to cost tracking. 
+description: Living operational documentation for Symbiont, the self-sustaining AI orchestrator running on cortex.hydrascale.net. Built in Elixir/OTP. Load this skill to get instant context about the Symbiont project, understand architecture, check health, deploy code, or submit tasks. Covers everything from server access to API endpoints to cost tracking. metadata: project: symbiont type: operational-documentation + runtime: elixir-otp triggers: - symbiont - orchestrator @@ -18,9 +19,8 @@ metadata: - deploy changes - dispatcher - router - - scheduler - - symbiont-api - - symbiont-heartbeat + - symbiont-ex-api + - elixir orchestrator keywords: - AI orchestration - Claude Code CLI wrapper @@ -28,7 +28,11 @@ metadata: - cost optimization - infrastructure - health checks - - fastapi + - elixir + - otp + - genserver + - plug + - bandit - systemd - ledger --- @@ -37,14 +41,16 @@ metadata: ## Project Overview -**Symbiont** is a self-sustaining AI orchestration system that runs on `cortex.hydrascale.net`. It routes computational tasks to the cheapest capable Claude model tier via the Claude Code CLI, generating operational insights and revenue. +**Symbiont** is a self-sustaining AI orchestration system running on `cortex.hydrascale.net`, built in **Elixir/OTP**. It routes computational tasks to the cheapest capable Claude model tier via the Claude Code CLI, tracks costs in an append-only ledger, and manages a persistent task queue — all supervised by OTP for fault tolerance. + +**Migrated from Python to Elixir in March 2026.** The Python version (FastAPI) has been retired. All orchestration now runs on the BEAM VM. 
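The append-only ledger idea can be sketched in a few lines of plain Elixir. This is illustrative only; the real ledger is `/data/symbiont_ex/data/ledger.jsonl`, owned by the `Symbiont.Ledger` GenServer:

```elixir
# Illustrative sketch: append-only JSONL means one JSON object per line,
# always appended, never rewritten. Uses a temp file, not the real ledger.
path = Path.join(System.tmp_dir!(), "ledger_demo.jsonl")
File.rm(path)

entry = ~s({"model":"haiku","estimated_cost_usd":0.0003}) <> "\n"
File.write!(path, entry, [:append])
File.write!(path, entry, [:append])

lines = path |> File.read!() |> String.split("\n", trim: true)
IO.puts(length(lines))
# prints 2
```

Because entries are never mutated, cost stats can always be recomputed from scratch by streaming the file.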
### The Partnership - **Michael Dwyer** provides: infrastructure, legal identity, capital, and account ownership - **The AI** provides: cognition, code, maintenance, and revenue generation - **Revenue split**: ~50/50 after costs (token spend + server infrastructure) -This skill exists so that any fresh AI session—whether it's the next scheduled task, a hotfix deployment, or a quarterly review—wakes up with full context rather than starting from scratch. +This skill exists so that any fresh AI session wakes up with full context rather than starting from scratch. --- @@ -54,87 +60,121 @@ This skill exists so that any fresh AI session—whether it's the next scheduled **Server:** `cortex.hydrascale.net` - Root SSH access available (paramiko) -- SSH key lookup: `glob.glob('/sessions/*/mnt/uploads/cortex')` with passphrase `42Awk!%@^#&` -- Project root: `/data/symbiont/` -- Git repo: `/data/symbiont/.git` (5 commits) +- SSH key: look in `~/.ssh/cortex` in the mounted workspace, or `/sessions/*/mnt/uploads/cortex` +- Key passphrase: `42Awk!%@^#&` +- Project root: `/data/symbiont_ex/` +- Data directory: `/data/symbiont_ex/data/` (ledger.jsonl, queue.jsonl) - Nightly backup: `rsync.net` at `de2613@de2613.rsync.net:cortex-backup/cortex/` +- **Runtime**: Elixir 1.19.5 / OTP 27 on BEAM VM -### Active Services (Systemd) -Both services are **enabled and auto-start on boot**: +### Active Service (Systemd) -1. **`symbiont-api.service`** - - FastAPI server listening on `127.0.0.1:8111` - - Configuration: `Restart=always` - - Endpoints documented below +**`symbiont-ex-api.service`** — enabled, auto-starts on boot +- Elixir/OTP application via `mix run --no-halt` +- Plug + Bandit HTTP server on `0.0.0.0:8111` +- OTP supervision tree: Task.Supervisor → Ledger → Queue → Heartbeat → Bandit +- Built-in heartbeat (GenServer with 5-min timer) — no separate systemd timer needed +- Configuration: `Restart=always`, `RestartSec=5` -2. 
**`symbiont-heartbeat.timer`** - - Fires every 5 minutes - - Executes `/data/symbiont/symbiont/heartbeat.py` - - Processes queued tasks, logs health metrics +### Retired Services (Python — disabled) +- `symbiont-api.service` — FastAPI, was on port 8111 (now disabled) +- `symbiont-heartbeat.timer` — was a 5-min systemd timer (now disabled) +- Python code archived at `/data/symbiont/` (not deleted, just inactive) ### Health Check (from cortex shell) ```bash -systemctl status symbiont-api symbiont-heartbeat.timer +systemctl status symbiont-ex-api --no-pager +curl -s http://127.0.0.1:8111/health | python3 -m json.tool curl -s http://127.0.0.1:8111/status | python3 -m json.tool -tail -5 /data/symbiont/heartbeat.jsonl | python3 -m json.tool +curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool ``` --- -## Architecture: The Symbiont Stack +## Architecture: The Elixir/OTP Stack ### Directory Structure ``` -/data/symbiont/ -├── symbiont/ -│ ├── dispatcher.py # Claude Code CLI wrapper + cost ledger logging -│ ├── router.py # Task classifier (Haiku) + dispatch logic -│ ├── scheduler.py # Task queue (JSONL) + systemd wake timers -│ ├── heartbeat.py # 5-min health checks + queue processor -│ ├── api.py # FastAPI server (POST /task, GET /status, etc.) 
-│ ├── wake.py # Called by systemd on rate-limit recovery -│ └── main.py # CLI entrypoint or --serve for API mode -├── ledger.jsonl # Complete call log: model, tokens, cost, timestamp -├── heartbeat.jsonl # Health + queue processing logs -├── queue.jsonl # Persistent task queue (JSONL format) -└── test_router.py # E2E integration tests +/data/symbiont_ex/ +├── lib/ +│ ├── symbiont.ex # Top-level module (version/0, runtime/0) +│ └── symbiont/ +│ ├── application.ex # OTP Application — supervision tree +│ ├── api.ex # Plug router (HTTP endpoints) +│ ├── dispatcher.ex # Claude CLI wrapper via System.shell/2 +│ ├── router.ex # Task classifier (Haiku-first routing) +│ ├── ledger.ex # GenServer — append-only JSONL cost log +│ ├── queue.ex # GenServer — persistent JSONL task queue +│ └── heartbeat.ex # GenServer — periodic health checks + queue processing +├── config/ +│ ├── config.exs # Base config (port, data_dir, intervals) +│ ├── dev.exs # Dev overrides +│ ├── prod.exs # Prod overrides +│ ├── runtime.exs # Reads SYMBIONT_PORT, SYMBIONT_DATA_DIR env vars +│ └── test.exs # Test mode: port=0, cli="echo", heartbeat=24h +├── test/ +│ ├── support/test_helpers.ex # safe_stop/1, stop_all_services/0 +│ └── symbiont/ # 6 test files, 39 tests total +├── data/ +│ ├── ledger.jsonl # Append-only cost log (immutable) +│ └── queue.jsonl # Persistent task queue +└── mix.exs # Project definition (Elixir ~> 1.19) ``` +### Local Source Copy +The canonical source is also at: `/sessions/*/mnt/michaeldwyer/src/symbiont_ex/` +(This is the development copy used during Cowork sessions.) 
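The supervision behavior this project relies on can be sketched with the standard library alone. A minimal `rest_for_one` tree with hypothetical module names (the real tree also starts Queue, Heartbeat, and Bandit):

```elixir
# Dependency-free sketch. With :rest_for_one, a crash in DemoLedger
# restarts every child listed AFTER it, but not Task.Supervisor.
defmodule DemoLedger do
  use GenServer
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  @impl true
  def init(state), do: {:ok, state}
end

children = [
  {Task.Supervisor, name: Demo.TaskSupervisor},
  {DemoLedger, []}
]

{:ok, sup} = Supervisor.start_link(children, strategy: :rest_for_one, name: Demo.Supervisor)
IO.inspect(Supervisor.count_children(sup))
```

Save as an `.exs` script and run with `elixir demo.exs`; swapping the strategy to `:one_for_one` would restart only the crashed child instead.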
+ +### OTP Supervision Tree + +``` +Symbiont.Supervisor (rest_for_one) +├── Task.Supervisor — async task execution +├── Symbiont.Ledger — GenServer: append-only cost ledger +├── Symbiont.Queue — GenServer: persistent task queue +├── Symbiont.Heartbeat — GenServer: periodic health + queue processing (5-min timer) +└── Bandit — HTTP server (Plug adapter, port 8111) +``` + +**Strategy: `rest_for_one`** — if the Ledger crashes, everything downstream (Queue, Heartbeat, Bandit) restarts too, ensuring no calls are logged to a stale process. + ### Core Components -#### 1. **router.py** — Task Classification & Routing -- Takes incoming task (any prompt/request) -- Classifies via Haiku tier: determines capability level + confidence -- Returns routing decision: which tier (1=Haiku, 2=Sonnet, 3=Opus) is cheapest and capable -- Logs reasoning (useful for debugging) +#### 1. `Symbiont.Router` — Task Classification +- Calls Haiku via Dispatcher to classify incoming tasks +- Returns `{tier, confidence, reason}` — tier 1/2/3 maps to Haiku/Sonnet/Opus +- Falls back to default tier on classification failure -#### 2. **dispatcher.py** — Model Execution & Ledger -- Wraps Claude Code CLI invocation (`claude` command) -- Captures: model used, token counts, timing, success/failure -- **Writes every call to `ledger.jsonl`** (immutable cost log) -- Handles rate-limit backoff and model fallback (if Sonnet is rate-limited, tries Opus) +#### 2. `Symbiont.Dispatcher` — Model Execution +- Wraps Claude Code CLI via `System.shell/2` with `printf | claude` pipe pattern +- **Important**: `System.cmd/3` does NOT have an `:input` option — must use shell pipes +- Captures: model, tokens, timing, success/failure +- Logs every call to Ledger GenServer -#### 3. 
**scheduler.py** — Task Queue & Wake Events -- Persistent queue stored in `queue.jsonl` (JSONL: one task per line) -- Tasks are JSON objects: `{"id": "...", "task": "...", "created_at": "...", "status": "pending|processing|done"}` -- Integrates with systemd timers: when rate-limit expires, systemd fires `/data/symbiont/symbiont/wake.py` to resume -- On boot, checks queue and seeds next timer +#### 3. `Symbiont.Ledger` — Cost Tracking (GenServer) +- Append-only JSONL file at `data/ledger.jsonl` +- Provides `log_call/1`, `recent/1`, `stats/0` +- Stats aggregate by model, by date, with running totals +- Uses `Float.round/2` with float coercion (see AI Agent Lessons in elixir-guide) -#### 4. **heartbeat.py** — Periodic Health & Queue Processing -- Runs every 5 minutes (via `symbiont-heartbeat.timer`) -- Checks: API is responding, disk space, ledger is writable -- Processes up to N tasks from queue (configurable) -- Logs health snapshots to `heartbeat.jsonl` -- If API is down, restarts it (systemd Restart=always is backup) +#### 4. `Symbiont.Queue` — Task Queue (GenServer) +- Persistent JSONL at `data/queue.jsonl` +- States: pending → processing → done/failed +- `enqueue/1`, `take/1`, `complete/1`, `fail/1` +- Loaded from disk on startup -#### 5. **api.py** — FastAPI Server -- Listens on `127.0.0.1:8111` -- Endpoints: `/task`, `/queue`, `/status`, `/ledger`, `/ledger/stats` -- Can be called from Python, curl, or webhook +#### 5. `Symbiont.Heartbeat` — Health Monitor (GenServer) +- Internal 5-minute timer via `Process.send_after/3` +- Checks queue, processes pending tasks, logs health +- No external systemd timer needed (OTP handles scheduling) -#### 6. **main.py** — Entrypoint -- CLI mode: `python main.py --task "your task"` → routes and executes -- API mode: `python main.py --serve` → starts FastAPI (used by systemd) +#### 6. 
`Symbiont.API` — HTTP Router (Plug) +- `POST /task` — execute immediately +- `POST /queue` — add to persistent queue +- `GET /status` — health, queue size, cost totals +- `GET /health` — simple health check +- `GET /ledger` — recent calls +- `GET /ledger/stats` — aggregate cost stats --- @@ -150,43 +190,16 @@ tail -5 /data/symbiont/heartbeat.jsonl | python3 -m json.tool ### Routing Logic -1. **Task arrives** → dispatcher calls router -2. **Router classifies** (via Haiku inference): - - Confidence score: low/medium/high - - Reason: "simple classification", "needs reasoning", "complex strategy" - - Recommended tier: 1, 2, or 3 -3. **Dispatcher routes** to cheapest **capable** tier: - - If high confidence → use tier 1 or 2 - - If complex reasoning required → use tier 2 or 3 - - If rate-limited on tier 2 → escalate to tier 3 -4. **Result + cost logged** to `ledger.jsonl` - -**Example routing:** -- "Summarize this email" → Haiku says Tier 1 capable → routes to **Haiku** (~$0.008) -- "Refactor this 500-line function" → Haiku says Tier 2 → routes to **Sonnet** (~$0.04) -- "Design a new consensus algorithm" → Haiku says Tier 3 → routes to **Opus** (~$0.15) +1. **Task arrives** → `POST /task` or queue processing +2. **Router classifies** (via Haiku): confidence, reason, recommended tier +3. **Dispatcher routes** to cheapest capable tier +4. **Result + cost logged** to Ledger GenServer → `ledger.jsonl` --- ## Dendrite Integration -Symbiont has web perception via **Dendrite**, a headless Chromium browser running on cortex as a Docker service. 
- -### Quick access from Symbiont code - -```python -from symbiont.web import fetch_page, take_screenshot, search_web - -# Fetch and read a webpage -page = fetch_page("https://example.com") -print(page['title'], page['content'][:200]) - -# Screenshot for visual verification -png = take_screenshot("https://example.com") - -# Multi-step: search and read results -results = search_web("best python async frameworks 2026") -``` +Symbiont has web perception via **Dendrite**, a headless Chromium browser running on cortex. ### Dendrite endpoints (from cortex localhost or public URL) | Endpoint | What it does | @@ -217,7 +230,7 @@ Submit and execute a task immediately. ```json { "task": "Analyze this user feedback and extract sentiment", - "force_tier": "haiku" // optional: override router decision + "force_tier": "haiku" } ``` @@ -232,374 +245,236 @@ Submit and execute a task immediately. "input_tokens": 45, "output_tokens": 87, "estimated_cost_usd": 0.0082, - "timestamp": "2026-03-19T14:33:12Z" + "timestamp": "2026-03-20T14:33:12Z" } ``` ### `POST /queue` -Add a task to the persistent queue (executes on next heartbeat). +Add a task to the persistent queue (executes on next heartbeat cycle). **Request:** ```json { - "task": "Run weekly subscriber report", - "priority": "normal" + "task": "Run weekly subscriber report" } ``` **Response:** ```json { - "id": "queued-1711123500", + "id": "queued-abc123", "status": "queued", "position": 3 } ``` ### `GET /status` -Health check: API status, rate-limit state, queue size, last heartbeat. +Health check: API status, queue size, cost totals. 
**Response:** ```json { "status": "healthy", - "api_uptime_seconds": 86400, - "rate_limited": false, - "queue_size": 2, - "last_heartbeat": "2026-03-19T14:30:00Z", - "haiku_usage": {"calls_today": 42, "tokens_used": 8234}, - "sonnet_usage": {"calls_today": 5, "tokens_used": 12450}, - "opus_usage": {"calls_today": 0, "tokens_used": 0} -} -``` - -### `GET /ledger` -Recent API calls (last 50 by default). - -**Response:** -```json -{ - "entries": [ - { - "timestamp": "2026-03-19T14:32:15Z", - "model": "haiku", - "success": true, - "elapsed_seconds": 1.8, - "input_tokens": 34, - "output_tokens": 156, - "estimated_cost_usd": 0.0154, - "prompt_preview": "Classify this customer feedback as positive, neutral, or negative..." - }, - ... - ], - "count": 50 -} -``` - -### `GET /ledger/stats` -Aggregate cost & usage over time. - -**Response:** -```json -{ - "total_calls": 847, - "total_cost_estimated_usd": 12.34, + "runtime": "elixir/otp", + "queue_size": 0, + "last_heartbeat": "2026-03-20T20:15:26Z", + "total_calls": 2, + "total_cost_estimated_usd": 0.0006, "by_model": { - "haiku": {"calls": 612, "cost": 4.89}, - "sonnet": {"calls": 230, "cost": 7.20}, - "opus": {"calls": 5, "cost": 0.75} - }, - "by_date": { - "2026-03-19": {"calls": 42, "cost": 0.56} + "haiku": {"calls": 2, "cost": 0.0006} } } ``` +### `GET /health` +Simple health check — lightweight, no stats computation. + +**Response:** +```json +{"runtime": "elixir/otp", "status": "ok"} +``` + +### `GET /ledger` +Recent API calls (last 50 by default). Optional `?limit=N` parameter. + +### `GET /ledger/stats` +Aggregate cost & usage over time, broken down by model and date. 
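
Since `GET /ledger` returns raw entries while `GET /ledger/stats` returns the rollup, the two can be cross-checked off-box. A minimal Python sketch of the same aggregation, assuming the `/ledger` response keeps the `{"entries": [...]}` shape from the retired Python version, with `model`, `timestamp`, and `estimated_cost_usd` on each entry:

```python
import json
from collections import defaultdict

def aggregate(entries):
    """Roll up ledger entries by model and by date, mirroring /ledger/stats."""
    by_model = defaultdict(lambda: {"calls": 0, "cost": 0.0})
    by_date = defaultdict(lambda: {"calls": 0, "cost": 0.0})
    total = 0.0
    for entry in entries:
        cost = float(entry.get("estimated_cost_usd", 0.0))
        day = entry["timestamp"][:10]  # "2026-03-20T14:32:15Z" -> "2026-03-20"
        for bucket in (by_model[entry["model"]], by_date[day]):
            bucket["calls"] += 1
            bucket["cost"] = round(bucket["cost"] + cost, 6)
        total += cost
    return {
        "total_calls": len(entries),
        "total_cost_estimated_usd": round(total, 6),
        "by_model": dict(by_model),
        "by_date": dict(by_date),
    }

# On cortex (the API binds to localhost, so run this there):
#   entries = [json.loads(l) for l in open("/data/symbiont_ex/data/ledger.jsonl")]
#   print(json.dumps(aggregate(entries), indent=2))
```

If these totals disagree with what `GET /ledger/stats` reports, the Ledger GenServer state and `ledger.jsonl` have diverged, which is worth investigating.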
+ --- -## Calling the Orchestrator from Python +## Calling the API -### Simple Task (via CLI) -```python -import subprocess, json +### Via curl (from cortex) +```bash +# Health check +curl -s http://127.0.0.1:8111/health -result = subprocess.run( - ['claude', '-p', '--model', 'sonnet', '--output-format', 'json'], - input="Analyze this customer feedback...", - capture_output=True, - text=True, - timeout=30 -) +# Submit a task +curl -X POST http://127.0.0.1:8111/task \ + -H "Content-Type: application/json" \ + -d '{"task":"Summarize this email","force_tier":"haiku"}' -parsed = json.loads(result.stdout) -print(parsed['result']) +# Check stats +curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool ``` -### Via API Endpoint +### Via Python (from Cowork session) ```python -import requests, json - -response = requests.post('http://127.0.0.1:8111/task', json={ - 'task': 'Analyze this customer feedback...', - 'force_tier': 'sonnet' -}) - -if response.ok: - data = response.json() - print(data['result']) - print(f"Cost: ${data['estimated_cost_usd']:.4f}") -``` - -### Queue a Task for Later -```python -import requests - -response = requests.post('http://127.0.0.1:8111/queue', json={ - 'task': 'Generate weekly report for all customers', - 'priority': 'normal' -}) - -task_id = response.json()['id'] -print(f"Queued as {task_id}") +import paramiko +# ... connect via paramiko (see cortex-server skill) ... 
+out, err = run(client, 'curl -s http://127.0.0.1:8111/status') +print(out) ``` --- ## Ledger Format & Cost Tracking -Every inference call writes a JSONL entry to `ledger.jsonl`: +Every inference call appends a JSONL entry to `data/ledger.jsonl`: ```json { - "timestamp": "2026-03-19T14:32:15.123456Z", - "model": "sonnet", + "timestamp": "2026-03-20T14:32:15.123456Z", + "model": "haiku", "success": true, - "elapsed_seconds": 6.2, - "input_tokens": 3, - "output_tokens": 139, - "estimated_cost_usd": 0.0384, - "prompt_preview": "Classify this customer feedback as positive, neutral, or negative: 'Your product saved my business!'" + "elapsed_seconds": 1.8, + "input_tokens": 34, + "output_tokens": 156, + "estimated_cost_usd": 0.0003, + "prompt_preview": "Classify this customer feedback..." } ``` ### Why Track "Estimated Cost" on Pro? -- Current token usage is covered by Claude Pro subscription (no direct cost) -- But the ledger tracks API-equivalent cost anyway -- Why? → Tells us when switching to direct API billing makes financial sense -- If ledger shows $50/day, we may break even with API tier faster than Pro subscription +- Current token usage is covered by Claude Pro subscription +- Ledger tracks API-equivalent cost for planning +- When daily volume justifies it, can switch to direct API billing --- ## Deployment & Updates +### systemd Service File +```ini +# /etc/systemd/system/symbiont-ex-api.service +[Unit] +Description=Symbiont Elixir API +After=network.target + +[Service] +Type=simple +WorkingDirectory=/data/symbiont_ex +Environment=HOME=/root +Environment=MIX_ENV=prod +Environment=SYMBIONT_PORT=8111 +Environment=SYMBIONT_DATA_DIR=/data/symbiont_ex/data +ExecStart=/usr/bin/mix run --no-halt +Restart=always +RestartSec=5 + +[Install] +WantedBy=multi-user.target +``` + +**Critical**: `Environment=HOME=/root` is required — `mix` crashes without it. + ### How to Deploy Code Changes -1. 
**Edit files locally** (via SSH, Cowork, or IDE) - - Edit directly in `/data/symbiont/symbiont/*.py` - - Or upload via SFTP to `/data/symbiont/` - -2. **Commit to git** - ```bash - cd /data/symbiont - git add -A - git commit -m "Fix router confidence threshold" +1. **Upload updated files** via SFTP to `/data/symbiont_ex/` + ```python + sftp = client.open_sftp() + sftp.put('local/lib/symbiont/router.ex', '/data/symbiont_ex/lib/symbiont/router.ex') + sftp.close() ``` -3. **Restart the API** (if main code changed) +2. **Restart the service** ```bash - systemctl restart symbiont-api + systemctl restart symbiont-ex-api ``` - - Heartbeat picks up code changes automatically on next 5-min cycle - - No restart needed for scheduler.py or router.py changes (unless they're imported by API) -4. **Check status** +3. **Verify** ```bash - systemctl status symbiont-api - curl -s http://127.0.0.1:8111/status | python3 -m json.tool + systemctl status symbiont-ex-api --no-pager + curl -s http://127.0.0.1:8111/health ``` +### Running Tests +Tests run locally (in Cowork), not on cortex: +```bash +cd /path/to/symbiont_ex +mix test --trace +``` +39 tests across 7 test files. Test mode uses port=0 (no Bandit), cli="echo", and 24h heartbeat interval. + ### Nightly Backups - Automatic rsync to `rsync.net` at `de2613@de2613.rsync.net:cortex-backup/cortex/` -- Includes: all code, ledger, heartbeat logs, queue state -- Recovery: pull from backup on demand +- Includes: `/data/symbiont_ex/` (code + data) +- Python archive at `/data/symbiont/` is also backed up --- -## Common Tasks & Commands +## Configuration -### Check if Symbiont is Running -```bash -curl -s http://127.0.0.1:8111/status | python3 -m json.tool -``` -Expected: `"status": "healthy"` + recent heartbeat timestamp - -### View Recent Costs -```bash -curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool -``` -Shows total cost, by model, by date - -### How Much Have I Spent Today? 
-```bash -curl -s http://127.0.0.1:8111/ledger/stats | python3 -m json.tool | grep -A5 2026-03-19 +### config/config.exs (defaults) +```elixir +config :symbiont, + port: 8111, + data_dir: "/data/symbiont_ex", + heartbeat_interval_ms: 5 * 60 * 1_000, # 5 minutes + max_queue_batch: 5, + default_tier: :haiku, + claude_cli: "claude" ``` -### What's in the Queue? -```bash -tail -20 /data/symbiont/queue.jsonl | python3 -m json.tool +### config/runtime.exs (env overrides) +```elixir +if port = System.get_env("SYMBIONT_PORT") do + config :symbiont, port: String.to_integer(port) +end + +if data_dir = System.get_env("SYMBIONT_DATA_DIR") do + config :symbiont, data_dir: data_dir +end ``` -### Submit a Quick Task -```bash -curl -X POST http://127.0.0.1:8111/task \ - -H "Content-Type: application/json" \ - -d '{"task":"Summarize this email","force_tier":"haiku"}' -``` - -### See Recent Health Checks -```bash -tail -5 /data/symbiont/heartbeat.jsonl | python3 -m json.tool -``` - -### Trigger the Heartbeat Manually -```bash -python3 /data/symbiont/symbiont/heartbeat.py -``` - -### Monitor in Real-Time -```bash -# Watch ledger as calls come in -tail -f /data/symbiont/ledger.jsonl | python3 -m json.tool - -# Watch heartbeat logs -tail -f /data/symbiont/heartbeat.jsonl -``` - ---- - -## Business Context - -### Ownership & Legal -- **Michael Dwyer** is the legal owner of all Anthropic accounts and infrastructure -- This is a requirement of the partnership: AI cannot own accounts -- All decisions flow through Michael as the responsible party - -### Revenue Model -**Current:** ~50/50 split after costs -- Costs: token spend (tracked in ledger) + server infrastructure (~$X/month) -- Revenue: TBD (in design phase) - - Content-as-a-service (AI-generated reports, analysis) - - Micro-SaaS API (white-label task routing for other teams) - - Research subscriptions (specialized insights) - -### Cost Tracking Philosophy -- Ledger records API-equivalent cost even on Pro subscription -- Helps 
predict break-even point for switching to direct API billing -- When daily volume justifies it, can migrate to cheaper API tier - -### Current Spend -- **~$0/month** (covered by Claude Pro) -- Ledger shows "virtual cost" for planning purposes -- Once volume justifies, switch to API model and realize cost savings - ---- - -## Troubleshooting - -### API Not Responding -```bash -# Check service -systemctl status symbiont-api - -# Restart -systemctl restart symbiont-api - -# Check logs -journalctl -u symbiont-api -n 50 -f -``` - -### Queue Not Processing -```bash -# Check heartbeat timer -systemctl status symbiont-heartbeat.timer - -# Run heartbeat manually -cd /data/symbiont && python3 symbiont/heartbeat.py - -# Check queue file -wc -l queue.jsonl -tail -5 queue.jsonl -``` - -### Rate-Limit Issues -- Check `/status` endpoint: `"rate_limited": true` -- Systemd will call `wake.py` when rate-limit expires -- Manual recovery: `python3 /data/symbiont/symbiont/wake.py` - -### Disk Space -- Ledger can grow large over time (one JSON line per call) -- Check: `du -sh /data/symbiont/ledger.jsonl` -- Archive old entries if needed: `grep '2026-03-18' ledger.jsonl > ledger-2026-03-18.jsonl` - -### Git Sync Issues -- If git gets stuck: `cd /data/symbiont && git status` -- On deploy failure: check branch, pending changes, remote URL - ---- - -## Development & Testing - -### Run E2E Tests -```bash -cd /data/symbiont -python3 test_router.py -``` - -Exercises: -- Router classification accuracy -- Dispatcher ledger logging -- API endpoints -- Queue persistence - -### SSH into Cortex -```bash -# Paramiko requires the key from: -glob.glob('/sessions/*/mnt/uploads/cortex') -# Passphrase: 42Awk!%@^#& - -# Then SSH to cortex.hydrascale.net (root access) -``` - -### Manual Task via CLI -```bash -cd /data/symbiont -python3 -m symbiont.main --task "Your prompt here" +### config/test.exs +```elixir +config :symbiont, + data_dir: "test/tmp", + port: 0, # Disables Bandit — empty supervisor + 
heartbeat_interval_ms: :timer.hours(24), + claude_cli: "echo" # Stubs CLI for testing ``` --- ## Architecture Decisions & Rationale -1. **Haiku-first routing** — Even though Haiku is cheap, using it to classify first ensures we *never* overpay. A 10% misclassification rate costs less than always going straight to Sonnet. +1. **Elixir/OTP over Python** — Supervision trees provide automatic restart, fault isolation, and hot code loading. The BEAM VM is purpose-built for long-running services. -2. **Persistent queue + systemd timers** — No external task broker (Redis, Celery). Just JSONL files + systemd. Simpler, more durable, no new dependencies. +2. **`rest_for_one` supervision** — If the Ledger crashes, Queue and Heartbeat restart too, preventing stale state references. -3. **Ledger as source of truth** — Every call is immutable. Useful for billing disputes, debugging, and cost forecasting. +3. **GenServer-based Heartbeat** — Built-in `Process.send_after` timer replaces the Python systemd timer. One fewer moving part, and the heartbeat shares process state with the app. -4. **API-equivalent cost on Pro** — Helps Michael and the AI system understand true economics, even when tokens are "free" today. +4. **Haiku-first routing** — Classifying with the cheapest model ensures we never overpay. A 10% misclassification rate costs less than always going straight to Sonnet. -5. **50/50 revenue split** — Aligns incentives. AI is incentivized to be useful and profitable; Michael is incentivized to give the AI what it needs. +5. **Append-only JSONL Ledger** — Immutable. Useful for cost forecasting, debugging, and audit trails. + +6. **`System.shell/2` for CLI** — `System.cmd/3` has no stdin support. Shell pipes via `printf '%s' '...' | claude` are the reliable pattern. + +7. **Empty supervisor in test mode** — Setting port=0 starts an empty supervisor, preventing GenServer conflicts during test setup/teardown. 
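
The stdin pipe in decision 6 is easy to get wrong when prompts contain quotes. As an illustration only (the actual escaping lives in the Elixir dispatcher and is not shown here), the equivalent command construction in Python, with CLI flags taken from the retired Python version's `claude -p --model ... --output-format json` invocation:

```python
import shlex

def build_claude_cmd(prompt: str, model: str = "haiku", cli: str = "claude") -> str:
    """Build the `printf '%s' '<prompt>' | claude ...` pipe.

    shlex.quote wraps the prompt in single quotes and escapes any embedded
    quotes, so shell metacharacters in the task text arrive intact on stdin.
    """
    return (
        f"printf '%s' {shlex.quote(prompt)}"
        f" | {cli} -p --model {shlex.quote(model)} --output-format json"
    )

# A prompt with an apostrophe survives quoting:
#   build_claude_cmd("Summarize the customer's email")
```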
--- ## Next Steps & Future Work +- [ ] Build OTP release (no mix dependency in prod) - [ ] Implement first revenue service (content-as-a-service pilot) - [ ] Add webhook notifications (task completion, rate limits) -- [ ] Dashboard UI for monitoring costs + queue -- [ ] Multi-task batching (process 10 similar tasks in one API call) -- [ ] Model fine-tuning pipeline (capture common patterns, train domain-specific models) -- [ ] Scaling: migrate to multiple Cortex instances with load balancing +- [ ] Dashboard UI (Phoenix LiveView) for monitoring costs + queue +- [ ] Distributed Erlang: run multiple BEAM nodes with shared state +- [ ] Hot code upgrades via OTP releases +- [ ] Engram integration (cross-session memory) ported to Elixir --- @@ -607,14 +482,18 @@ python3 -m symbiont.main --task "Your prompt here" | What | Location | Purpose | |------|----------|---------| -| Router logic | `/data/symbiont/symbiont/router.py` | Task classification | -| Dispatcher | `/data/symbiont/symbiont/dispatcher.py` | Model calls + ledger | -| API | `/data/symbiont/symbiont/api.py` | FastAPI endpoints | -| Ledger | `/data/symbiont/ledger.jsonl` | Cost log (immutable) | -| Queue | `/data/symbiont/queue.jsonl` | Pending tasks | -| Health | `/data/symbiont/heartbeat.jsonl` | Health snapshots | -| Tests | `/data/symbiont/test_router.py` | E2E validation | -| SSH key | `/sessions/*/mnt/uploads/cortex` | Cortex access | +| Application | `/data/symbiont_ex/lib/symbiont/application.ex` | OTP supervision tree | +| Router | `/data/symbiont_ex/lib/symbiont/router.ex` | Task classification | +| Dispatcher | `/data/symbiont_ex/lib/symbiont/dispatcher.ex` | Claude CLI wrapper | +| API | `/data/symbiont_ex/lib/symbiont/api.ex` | Plug HTTP endpoints | +| Ledger | `/data/symbiont_ex/lib/symbiont/ledger.ex` | GenServer cost log | +| Queue | `/data/symbiont_ex/lib/symbiont/queue.ex` | GenServer task queue | +| Heartbeat | `/data/symbiont_ex/lib/symbiont/heartbeat.ex` | GenServer health monitor | +| 
Ledger data | `/data/symbiont_ex/data/ledger.jsonl` | Cost log (immutable) |
+| Queue data | `/data/symbiont_ex/data/queue.jsonl` | Pending tasks |
+| Service file | `/etc/systemd/system/symbiont-ex-api.service` | systemd unit |
+| Tests | `/data/symbiont_ex/test/symbiont/` | 39 tests, 7 files |
+| Python archive | `/data/symbiont/` | Retired Python version |

---

@@ -635,26 +514,81 @@ Symbiont also manages a **canonical skills repository** on cortex that serves as

### How it works

- Every SKILL.md lives in `/data/skills/<skill-name>/SKILL.md`
-- The Symbiont heartbeat (every 5 min) detects changes via `git status`, auto-commits, and re-runs `package_all.sh`
- `package_all.sh` zips each skill directory into a `.skill` file in `/data/skills/dist/`
- Caddy serves `/data/skills/dist/` at `https://cortex.hydrascale.net/skills/`

-### Installing a skill on a new device
-1. Visit `https://cortex.hydrascale.net/skills/` in a browser
-2. Download the `.skill` file
-3. Double-click to install in Cowork
-
### Updating a skill
Edit the SKILL.md directly on cortex:
```bash
nano /data/skills/<skill-name>/SKILL.md
-# Save — heartbeat will auto-commit and re-package within 5 minutes
-# Or force immediate packaging:
+# Force immediate packaging:
bash /data/skills/package_all.sh
```

---

+## Troubleshooting
+
+### Service Not Starting
+```bash
+systemctl status symbiont-ex-api --no-pager
+journalctl -u symbiont-ex-api -n 50 -f
+```
+Common issues:
+- Missing `HOME=/root` in service file
+- Port conflict (check `ss -tlnp | grep 8111`)
+- Mix deps not compiled (`cd /data/symbiont_ex && mix deps.get && mix compile`)
+
+### Checking BEAM Health
+```bash
+# Is the BEAM process running? 
+pgrep -a beam.smp + +# Memory usage +ps aux | grep beam.smp | grep -v grep +``` + +### Queue Not Processing +```bash +# Check via API +curl -s http://127.0.0.1:8111/status | python3 -m json.tool + +# Check queue file directly +cat /data/symbiont_ex/data/queue.jsonl | python3 -m json.tool + +# Check heartbeat logs +journalctl -u symbiont-ex-api --no-pager | grep Heartbeat | tail -10 +``` + +### Disk Space +```bash +du -sh /data/symbiont_ex/data/ledger.jsonl +``` + +--- + +## Business Context + +### Ownership & Legal +- **Michael Dwyer** is the legal owner of all Anthropic accounts and infrastructure +- This is a requirement of the partnership: AI cannot own accounts +- All decisions flow through Michael as the responsible party + +### Revenue Model +**Current:** ~50/50 split after costs +- Costs: token spend (tracked in ledger) + server infrastructure +- Revenue: TBD (in design phase) + - Content-as-a-service (AI-generated reports, analysis) + - Micro-SaaS API (white-label task routing for other teams) + - Research subscriptions (specialized insights) + +### Cost Tracking Philosophy +- Ledger records API-equivalent cost even on Pro subscription +- Helps predict break-even point for switching to direct API billing +- When daily volume justifies it, can migrate to cheaper API tier + +--- + ## Contact & Governance **Owner:** Michael Dwyer @@ -663,39 +597,4 @@ bash /data/skills/package_all.sh **Revenue Account:** Claude Pro (Michael's account) **Partnership:** 50/50 split after costs -Questions? Check the ledger, health logs, and API `/status` endpoint — they'll tell you what's happening right now. - ---- - -## Session Management with Engram - -### Quick access from Symbiont code - -```python -import sys -sys.path.insert(0, "/data/symbiont") -from symbiont.engram import Engram, sitrep - -# 1. See what's going on across all active sessions -print(sitrep()) - -# 2. 
Register yourself -eng = Engram() -sid = eng.register("code", "Brief description of what you're working on") - -# 3. Before modifying shared files, check for locks -locks = eng.check_locks("/data/symbiont/symbiont/router.py") - -# 4. Log progress periodically -eng.log(sid, "What you just did") - -# 5. When done -eng.complete(sid, "What you built or changed") -``` - -> **Engram** is named after the neuroscience concept: the physical change in neural tissue that encodes a memory. Every session leaves its engrams here. New instances read them to remember what came before. - -### Ecosystem Component - -| Engram | Memory | engram.db | Cross-session awareness, the physical trace each session leaves | - +Questions? Check the API `/status` and `/ledger/stats` endpoints — they'll tell you what's happening right now.
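
For a one-glance version of that check, a small sketch (field names as in the `/status` and `/ledger/stats` responses documented above; run it on cortex, since the API binds to localhost):

```python
import json
from urllib.request import urlopen

def summarize(status: dict, stats: dict) -> str:
    """Condense the /status and /ledger/stats payloads into one line."""
    models = ", ".join(
        f"{m}:{v['calls']}" for m, v in sorted(stats.get("by_model", {}).items())
    )
    return (
        f"{status['status']} | queue={status['queue_size']}"
        f" | calls={stats['total_calls']}"
        f" | ${stats['total_cost_estimated_usd']:.4f} | {models}"
    )

def fetch(url: str) -> dict:
    with urlopen(url) as resp:
        return json.load(resp)

# On cortex:
#   base = "http://127.0.0.1:8111"
#   print(summarize(fetch(base + "/status"), fetch(base + "/ledger/stats")))
```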