# Elixir / Phoenix Learnings — Cortex Status Dashboard

Patterns, gotchas, and reference notes from building the cortex_status Phoenix app.

## Project: cortex_status

- **Location on cortex**: `/data/cortex_status/`
- **Service**: `symbiont-ex-api.service` (systemd)
- **Port**: 4000 (behind Caddy at status.hydrascale.net)
- **Framework**: Phoenix 1.7.14, LiveView 1.0.0, LiveDashboard 0.8.4

---

## LiveView Patterns

### PubSub for Real-Time Updates
The status page subscribes to PubSub topics on mount and receives broadcast updates:
```elixir
def mount(_params, _session, socket) do
  if connected?(socket) do
    Phoenix.PubSub.subscribe(CortexStatus.PubSub, "service_status")
  end
  {:ok, assign(socket, ...)}
end

def handle_info({:status_update, new_status}, socket) do
  {:noreply, assign(socket, status: new_status)}
end
```

### Polling Pattern (Process.send_after)
For polling an external API from a LiveView (e.g., task progress):
```elixir
@poll_interval 1_000

# Start polling
timer = Process.send_after(self(), :poll_task, @poll_interval)
{:noreply, assign(socket, poll_timer: timer)}

# Handle poll
def handle_info(:poll_task, socket) do
  case fetch_progress(socket.assigns.task_id) do
    {:ok, data} when is_map(data) ->
      terminal = data["status"] in ["completed", "failed"]
      timer = if terminal, do: nil, else: Process.send_after(self(), :poll_task, @poll_interval)
      {:noreply, assign(socket, data: data, poll_timer: timer)}

    _ ->
      # Don't crash on bad data — keep polling
      timer = Process.send_after(self(), :poll_task, @poll_interval)
      {:noreply, assign(socket, poll_timer: timer)}
  end
end
```

**Gotcha**: Always cancel timers on unmount/logout:
```elixir
if socket.assigns.poll_timer, do: Process.cancel_timer(socket.assigns.poll_timer)
```
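
A minimal sketch of that cancellation as a reusable helper, assuming the timer message is `:poll_task` as in the polling code. `PollTimer` is a hypothetical module, not part of cortex_status. It accepts `nil`, and it also drops a `:poll_task` message that was delivered just before the cancel; in that case `Process.cancel_timer/1` returns `false` and the message would otherwise still be handled.

```elixir
defmodule PollTimer do
  @moduledoc "Hypothetical helper for cancelling a poll timer that may be nil."

  # No timer set: nothing to cancel.
  def cancel(nil), do: nil

  # Cancels the timer, then removes a :poll_task message that may already be
  # sitting in the caller's mailbox. Returns nil so the result can be stored
  # straight back into the poll_timer assign.
  def cancel(timer) when is_reference(timer) do
    Process.cancel_timer(timer)

    receive do
      :poll_task -> :ok
    after
      0 -> :ok
    end

    nil
  end
end
```

In the LiveView this would be called as `assign(socket, poll_timer: PollTimer.cancel(socket.assigns.poll_timer))`. The mailbox flush only works when it runs in the process that owns the timer, which is the LiveView process itself.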

### Component Attrs and Passing Assigns
When defining function components with `attr`, use `assigns` directly. For passing whole assigns bundles to sub-components, use a named assign:
```elixir
# Don't try to pass @assigns directly — use a named prop
<.auth_gate socket_assigns={assigns} />

defp auth_gate(assigns) do
  ~H"""
  <%= @socket_assigns.prompt %>
  """
end
```

### phx-change Goes on the Form, Not Individual Inputs
```elixir
# WRONG: phx-change on textarea alone won't fire
<textarea phx-change="update_prompt" name="prompt"></textarea>

# RIGHT: phx-change on the form, inputs trigger it
<form phx-submit="submit" phx-change="update">
  <textarea name="prompt"><%= @prompt %></textarea>
</form>
```

### Enum.with_index Returns {element, index}
```elixir
# WRONG — destructuring is backwards:
for {index, element} <- Enum.with_index(list)

# RIGHT:
for {element, index} <- Enum.with_index(list)
```
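
A short runnable illustration of the tuple order, using made-up service names. It also shows the optional second argument, which sets the starting index.

```elixir
# Hypothetical list, only for illustration.
services = ["caddy", "symbiont", "cortex_status"]

# Each tuple is {element, index}; the index starts at 0 by default.
pairs = Enum.with_index(services)
# pairs == [{"caddy", 0}, {"symbiont", 1}, {"cortex_status", 2}]

# Passing 1 as the offset gives 1-based positions, e.g. for display.
numbered =
  for {name, position} <- Enum.with_index(services, 1) do
    "#{position}. #{name}"
  end
# numbered == ["1. caddy", "2. symbiont", "3. cortex_status"]
```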

---

## LiveDashboard Custom Pages

### Basic Structure
```elixir
defmodule MyApp.DashboardPages.MyPage do
  use Phoenix.LiveDashboard.PageBuilder

  @impl true
  def menu_link(_, _), do: {:ok, "Page Title"}

  @impl true
  def render_page(_assigns) do
    {:ok, row(components: [card(value: "Hello", inner_title: "Title")])}
  end
end
```

### Registration in Router
```elixir
live_dashboard "/dashboard",
  metrics: MyAppWeb.Telemetry,
  additional_pages: [
    my_page: MyApp.DashboardPages.MyPage
  ]
```

### Available Components
- `card(value:, inner_title:)` — simple KV display
- `table(columns:, rows:, id:, title:, row_attrs:)` — data table
- `row(components:)` — horizontal layout
- `columns(columns:)` — multi-column layout

**Gotcha**: `row_attrs` must be a function: `fn row -> [{"data-id", row.id}] end`

---

## DateTime Gotchas

### DateTime.from_iso8601 Returns {:error, reason}, Not :error
```elixir
# WRONG:
case DateTime.from_iso8601(str) do
  {:ok, dt, _} -> dt
  :error -> str  # This clause never matches; bad input raises CaseClauseError
end

# RIGHT:
case DateTime.from_iso8601(str) do
  {:ok, dt, _} -> dt
  {:error, _} -> str  # Correct error tuple
end
```
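
A minimal sketch of a wrapper that keeps the error reason instead of falling back to the raw string. `Timestamps.parse/1` is a hypothetical helper, not something from cortex_status.

```elixir
defmodule Timestamps do
  @moduledoc "Hypothetical helper for parsing ISO 8601 timestamps."

  # Returns {:ok, datetime} or {:error, reason}. On success the third tuple
  # element (the UTC offset in seconds) is dropped; the DateTime is in UTC.
  def parse(str) when is_binary(str) do
    case DateTime.from_iso8601(str) do
      {:ok, dt, _offset_seconds} -> {:ok, dt}
      {:error, reason} -> {:error, reason}
    end
  end

  # Anything that is not a string (nil, numbers, maps) is rejected up front.
  def parse(_other), do: {:error, :not_a_string}
end
```

Callers can then pattern match on `{:error, :invalid_format}` or `{:error, :missing_offset}` and decide what to show, rather than rendering an unparsed string.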

---

## HTTP Calls from LiveView/GenServer

### Using Req Library
```elixir
# GET
case Req.get("http://localhost:8111/status", receive_timeout: 5000) do
  {:ok, %{status: 200, body: body}} when is_map(body) -> {:ok, body}
  {:ok, %{status: code}} -> {:error, "HTTP #{code}"}
  {:error, reason} -> {:error, reason}
end

# POST with JSON
case Req.post(url, json: %{"prompt" => prompt}) do
  {:ok, %{status: 200, body: body}} -> {:ok, body}
  ...
end
```

**Gotcha**: Always handle the case where `body` is a string rather than a decoded map. Req parses JSON automatically when the response content-type is `application/json`; a JSON payload served with a different content-type is not decoded. In the GET example above, a 200 response with a string body falls through to the second clause and is reported as `HTTP 200`.
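
One way to make the string-body case explicit is sketched below. `ResponseBody.classify/1` is a hypothetical helper; it matches on plain `:status` and `:body` keys, so it should accept a `%Req.Response{}` as well as a bare map.

```elixir
defmodule ResponseBody do
  @moduledoc "Hypothetical helper: classifies a response by status and body type."

  # 200 with a decoded JSON object: the happy path.
  def classify(%{status: 200, body: body}) when is_map(body), do: {:ok, body}

  # 200 but the body is still a raw string, so it was not decoded as JSON.
  def classify(%{status: 200, body: body}) when is_binary(body),
    do: {:error, {:unparsed_body, body}}

  # 200 with some other shape (for example a JSON array decoded to a list).
  def classify(%{status: 200, body: other}), do: {:error, {:unexpected_body, other}}

  # Any non-200 status.
  def classify(%{status: code}), do: {:error, {:http_status, code}}
end
```

This keeps "wrong status" and "right status, wrong body" as separate errors, which makes the message shown in the UI more useful.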

---

## Application Configuration

### Reading Config at Runtime (Not Compile-Time)
Use function calls instead of module attributes for config that may change:
```elixir
# WRONG — baked in at compile time:
@symbiont_url Application.get_env(:cortex_status, :services)[:symbiont_url]

# RIGHT — reads at runtime:
defp symbiont_url do
  config = Application.get_env(:cortex_status, :services, [])
  Keyword.get(config, :symbiont_url, "http://127.0.0.1:8111")
end
```
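
The difference can be seen in a small script. This is a sketch under assumptions: `ConfigDemo` is a hypothetical module, the second URL is a made-up value, and `Application.put_env/3` stands in for what `runtime.exs` would set at boot.

```elixir
# Seed the config before the module is compiled (demo value).
Application.put_env(:cortex_status, :services, symbiont_url: "http://127.0.0.1:8111")

defmodule ConfigDemo do
  # Evaluated once, while this module is being compiled.
  @compiled_url Application.get_env(:cortex_status, :services)[:symbiont_url]

  def compiled_url, do: @compiled_url

  # Evaluated on every call, so it sees later changes.
  def runtime_url do
    :cortex_status
    |> Application.get_env(:services, [])
    |> Keyword.get(:symbiont_url, "http://127.0.0.1:8111")
  end
end

# Simulate the config changing after compilation.
Application.put_env(:cortex_status, :services, symbiont_url: "http://10.0.0.5:8111")

stale = ConfigDemo.compiled_url()
fresh = ConfigDemo.runtime_url()
```

Here `stale` keeps the value from compile time while `fresh` reflects the updated config, which is the same situation as a release built on one machine and configured through environment variables on another.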

---

## os_mon for System Metrics

The app uses Erlang's `os_mon` for host-level metrics (requires `:os_mon` in `extra_applications`):
```elixir
cpu_load = :cpu_sup.avg1() / 256  # 1-minute load average (256 = load 1.00; can exceed 1)
{mem_total, mem_alloc, _} = :memsup.get_memory_data()  # bytes
disk_data = :disksup.get_disk_data()  # [{mount, total_kb, percent_used}]
```
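
The raw values usually need a little shaping before display. A minimal sketch, assuming the tuple shapes shown above; `HostMetrics` is a hypothetical module and the functions take the raw values as arguments so they can be tried without `os_mon` running.

```elixir
defmodule HostMetrics do
  @moduledoc "Hypothetical formatting helpers for raw os_mon values."

  # :cpu_sup.avg1/0 reports the load average multiplied by 256.
  def load_average(raw_avg), do: raw_avg / 256

  # Percent of memory in use, from the total and allocated byte counts.
  def memory_percent(total, allocated) when total > 0,
    do: Float.round(allocated / total * 100, 1)

  # Disk entries carry the mount point as a charlist; convert it to a string.
  def disk_entry({mount, total_kb, percent_used}),
    do: %{mount: List.to_string(mount), total_kb: total_kb, percent_used: percent_used}
end
```

For example, `Enum.map(:disksup.get_disk_data(), &HostMetrics.disk_entry/1)` would give a list of maps that is easier to render in a template.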

---

## Release & Deployment

### Build Commands
```bash
cd /data/cortex_status
MIX_ENV=prod mix deps.get
MIX_ENV=prod mix compile
MIX_ENV=prod mix assets.deploy  # tailwind + esbuild
MIX_ENV=prod mix release --overwrite
systemctl restart symbiont-ex-api
```

### Environment Variables (runtime.exs)
- `SECRET_KEY_BASE` — required in prod
- `PHX_HOST` — defaults to cortex.hydrascale.net
- `PORT` — defaults to 4000
- `SYMBIONT_URL` — override Symbiont API base URL
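
A sketch of how these variables might be read in `config/runtime.exs`. The actual file is not reproduced in these notes, so the layout below is an assumption; only the variable names, defaults, and endpoint keys come from the notes themselves.

```elixir
# config/runtime.exs (assumed layout, for illustration)
import Config

if config_env() == :prod do
  # Required: boot fails with a clear error if it is missing.
  secret_key_base = System.fetch_env!("SECRET_KEY_BASE")

  # Optional, with the defaults listed above.
  host = System.get_env("PHX_HOST", "cortex.hydrascale.net")
  port = String.to_integer(System.get_env("PORT", "4000"))

  config :cortex_status, CortexStatusWeb.Endpoint,
    url: [host: host, port: 443, scheme: "https"],
    http: [ip: {127, 0, 0, 1}, port: port],
    secret_key_base: secret_key_base

  # Read by the app at runtime via Application.get_env/3.
  config :cortex_status, :services,
    symbiont_url: System.get_env("SYMBIONT_URL", "http://127.0.0.1:8111")
end
```

Environment variables are always strings, so `PORT` has to be converted with `String.to_integer/1` before it is passed to the endpoint.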

---

## check_origin: The LiveView Connection Killer

When LiveView connections silently fail (`liveSocket.isConnected()` returns `false`,
`_mount_attempts` climbs into the thousands), the most likely culprit is a `check_origin`
mismatch. Phoenix checks the HTTP `Origin` header against its configured URL.

**Symptom**: Phoenix logs show:
```
[error] Could not check origin for Phoenix.Socket transport.
Origin of the request: https://cortex.hydrascale.net
```

**Root cause**: Setting `url: [host: h, port: 443, scheme: "https"]` in `runtime.exs`
causes Phoenix to expect an origin of `https://cortex.hydrascale.net:443`, but browsers
send `https://cortex.hydrascale.net` (no port — 443 is implicit for HTTPS). String
comparison fails.

**Fix**: Explicitly set `check_origin` in the endpoint config in `runtime.exs`:
```elixir
config :cortex_status, CortexStatusWeb.Endpoint,
  url: [host: host, port: 443, scheme: "https"],
  http: [ip: {127, 0, 0, 1}, port: port],
  secret_key_base: secret_key_base,
  check_origin: ["https://cortex.hydrascale.net"]  # ← explicit, no port
```

**Note**: In this deployment LiveView runs over the `longpoll` transport (WebSocket
upgrades may not work through all Caddy configs). `longpoll` is functionally equivalent
for most purposes: slightly more latency, but fully supported.

---

## LiveView Auth Gate Pattern

### Password-Protected LiveView Pages
For pages that need a simple password gate (like Mission Control), use assigns to track
auth state and conditionally render:
```elixir
def mount(_params, _session, socket) do
  {:ok, assign(socket, authenticated: false, error: nil, prompt: "")}
end

def handle_event("authenticate", %{"password" => pw}, socket) do
  if pw == Application.get_env(:my_app, :task_password) do
    {:noreply, assign(socket, authenticated: true, error: nil)}
  else
    {:noreply, assign(socket, error: "Invalid password")}
  end
end
```

In the template, wrap content with `<%= if @authenticated do %>`.

**Gotcha**: The password check happens server-side in the LiveView process, and the
gated markup is only rendered and sent to the browser once `@authenticated` is true, so
the gate cannot be bypassed from the client. But remember: the initial static
render (before WebSocket connects) will show the unauthenticated state, so don't put
sensitive data in the assigns until after authentication.

---

## LiveView Silent Failures — Always Show Error Feedback

### The Problem
When a LiveView event handler hits an error (API call fails, validation error, etc.)
and you just return `{:noreply, socket}` without updating assigns, the user sees
*nothing happen*. The button appears to do nothing. This is extremely confusing.

### The Fix Pattern
Always maintain an `error` assign and display it:
```elixir
def handle_event("submit_task", %{"prompt" => prompt}, socket) do
  case submit_to_api(prompt) do
    {:ok, task_id} ->
      {:noreply, assign(socket, task_id: task_id, error: nil)}

    {:error, reason} ->
      # inspect/1 so non-string reasons (structs, tuples) don't raise on interpolation
      {:noreply, assign(socket, error: "Task failed: #{inspect(reason)}")}
  end
end
```

In the template:
```elixir
<%= if @error do %>
  <div class="alert alert-danger"><%= @error %></div>
<% end %>
```

**Lesson learned the hard way**: The Mission Control "Execute" button did nothing
for a while because the Symbiont API was rejecting the auth token, but the error
was swallowed silently. Always surface errors to the UI.

---

## API Authentication from LiveView

### Bearer Token vs Query Param vs JSON Body
When calling external APIs from LiveView, be careful about where auth tokens go.
FastAPI's `Depends()` for auth reads from specific locations — if the API expects
a query param and you send it in the JSON body, auth silently fails.

**Preferred pattern**: Use `Authorization: Bearer <token>` header — it's unambiguous:
```elixir
headers = [{"authorization", "Bearer #{token}"}, {"content-type", "application/json"}]

case Req.post(url, json: payload, headers: headers) do
  {:ok, %{status: 200, body: body}} -> {:ok, body}
  {:ok, %{status: code, body: body}} -> {:error, "HTTP #{code}: #{inspect(body)}"}
  {:error, reason} -> {:error, inspect(reason)}
end
```

**Gotcha**: Always match on the status code, not just `{:ok, _}`. A 401 or 500
response is still `{:ok, %Req.Response{}}` — it's only `:error` if the HTTP
request itself fails (timeout, DNS, connection refused).

---

## LiveView Form Gotchas (Expanded)

### phx-submit Doesn't Fire Without a Submit Button
If your form has `phx-submit="do_thing"` but no `<button type="submit">`, pressing
Enter in a text input may not trigger the event in all browsers.

### Textarea Value Persistence
When using `phx-change` on a form with a textarea, the server receives the current
value on every keystroke. If your `handle_event` for `phx-change` doesn't re-assign
the textarea value, it can appear to reset or flicker:
```elixir
# In handle_event("update", params, socket):
def handle_event("update", %{"prompt" => prompt}, socket) do
  {:noreply, assign(socket, prompt: prompt)}  # ← must re-assign
end
```

### Form Params Are Always Strings
All form params arrive as strings, even for number inputs:
```elixir
# WRONG:
def handle_event("set_count", %{"count" => count}, socket) when is_integer(count)
# This clause NEVER matches — count is always a string

# RIGHT:
def handle_event("set_count", %{"count" => count_str}, socket) do
  count = String.to_integer(count_str)
  {:noreply, assign(socket, count: count)}
end
```
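
`String.to_integer/1` raises on an empty or non-numeric string, which a cleared number input will send. A minimal sketch of a non-raising alternative using `Integer.parse/1`; `FormParams.parse_count/1` is a hypothetical helper.

```elixir
defmodule FormParams do
  @moduledoc "Hypothetical helper for parsing integer form fields."

  # Integer.parse/1 returns {integer, rest} or :error. Requiring rest == ""
  # rejects partially numeric input such as "12abc".
  def parse_count(value) when is_binary(value) do
    case Integer.parse(String.trim(value)) do
      {count, ""} -> {:ok, count}
      _ -> {:error, :invalid_integer}
    end
  end

  # Missing field or unexpected type.
  def parse_count(_other), do: {:error, :invalid_integer}
end
```

In the event handler, the `{:error, _}` branch is a natural place to set the `error` assign instead of letting the LiveView process crash.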

---

## LiveDashboard Gotchas

### table() Component Limitations
The LiveDashboard `table()` component expects very specific data shapes and can be
finicky with dynamic data. If your data doesn't fit cleanly, use `card()` with
formatted text instead — it's more flexible and less error-prone.

**What went wrong**: Mission Control initially tried to use `table()` for task display
but hit issues with dynamic columns. Switched to `card()` with pre-formatted text,
which worked immediately.

### Custom Page render_page/1 Returns Tuples
`render_page/1` must return `{:ok, component_tree}`, not just a component:
```elixir
# WRONG:
def render_page(assigns), do: row(components: [...])

# RIGHT:
def render_page(_assigns), do: {:ok, row(components: [...])}
```

---

## Debugging LiveView Connections

### Diagnosis Checklist (when LiveView "doesn't work")
1. **Check `liveSocket.isConnected()`** in browser console — `false` means the
   WebSocket/longpoll connection failed
2. **Check `_mount_attempts`** — if climbing into thousands, it's retrying and failing
3. **Check Phoenix logs** for `check_origin` errors (most common cause)
4. **Check Caddy/reverse proxy** — WebSocket upgrade headers may be stripped
5. **Check `runtime.exs`** — host/port/scheme must match the actual public URL
6. **Use Dendrite** to automate this: navigate to the page, run
   `liveSocket.isConnected()` via JS, check the result programmatically

### Longpoll vs WebSocket
Phoenix LiveView supports both transports. Behind Caddy, longpoll is often more
reliable. In `app.js`:
```javascript
let liveSocket = new LiveSocket("/live", Socket, {
  params: {_csrf_token: csrfToken},
  // transport: WebSocket  // uncomment to force WebSocket
})
```

If WebSocket connections fail, the client falls back to longpoll automatically when
the `longPollFallbackMs` option is set in the `LiveSocket` options.
This is fine for most use cases.

---

## Caddy + Phoenix Integration Notes

### Reverse Proxy Config
```
cortex.hydrascale.net {
    reverse_proxy localhost:4000
    encode gzip
}
```

**Important**: Caddy handles TLS termination. Phoenix should listen on plain HTTP
(127.0.0.1 only). Don't configure Phoenix for HTTPS — let Caddy do it.

### The Self-Check Trap
If your Phoenix status page monitors URLs including its own domain
(e.g., `cortex.hydrascale.net`), the HTTP request goes through Caddy, back to
Phoenix, creating a circular dependency that times out. The timeout handler may
also lose the site name, producing mysterious "?" entries.

**Fix**: Don't have the app check itself. If the status page is loading, it's up.
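
One way to enforce this is to filter the app's own host out of the monitored list before any checks run. A minimal sketch; `MonitorTargets.reject_self/2` is a hypothetical helper and the URLs in the usage note are placeholders.

```elixir
defmodule MonitorTargets do
  @moduledoc "Hypothetical helper: drops monitored URLs that point back at this app."

  # Compares the host part of each URL with the app's own public host.
  def reject_self(urls, own_host) do
    Enum.reject(urls, fn url -> URI.parse(url).host == own_host end)
  end
end
```

For example, `MonitorTargets.reject_self(["https://cortex.hydrascale.net/", "https://example.com/health"], "cortex.hydrascale.net")` keeps only the `example.com` entry. The own host could come from the same `PHX_HOST` value used for the endpoint.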

---

## Release Build Gotchas

### Mix Release vs Mix Run
In development: `mix phx.server` or `iex -S mix phx.server`
In production: always use a release build:
```bash
MIX_ENV=prod mix deps.get
MIX_ENV=prod mix compile
MIX_ENV=prod mix assets.deploy
MIX_ENV=prod mix release --overwrite
```

**Gotcha**: `mix assets.deploy` must run BEFORE `mix release`. The release bundles
the compiled assets — if you skip this step, the release will serve stale CSS/JS
or no assets at all.

### Config Hierarchy Matters
```
config/config.exs   → compile-time defaults (all envs)
config/dev.exs      → compile-time dev overrides
config/prod.exs     → compile-time prod overrides
config/runtime.exs  → runtime config (reads env vars, runs at boot)
```

**Critical**: `Application.get_env/3` in module attributes (`@foo Application.get_env(...)`)
reads at **compile time**. Use `Application.compile_env/3` to make this explicit, or
better yet, read config in a function that runs at runtime.

### SECRET_KEY_BASE
Required in prod. Generate with: `mix phx.gen.secret`
Set as environment variable or hardcode in runtime.exs (on a single-server deploy
where the .env file is secured, this is fine).