# Elixir Part 2: Concurrency, OTP & State Machines

## BEAM Process Fundamentals

- **Processes are cheap** — ~2KB initial memory, microseconds to spawn, millions concurrent
- **No shared memory** — message passing only
- **Per-process GC** — no stop-the-world pauses
- **Preemptive scheduling** — one scheduler per CPU core, ~4000 reductions then yield
- **Process isolation** — one crash doesn't affect others

### Spawning and Messaging

```elixir
pid = spawn(fn ->
  receive do
    {:greet, name} -> IO.puts("Hello, #{name}!")
  end
end)

send(pid, {:greet, "Michael"})

# Linked — bidirectional crash propagation
pid = spawn_link(fn -> raise "boom" end)

# Monitored — one-directional crash notification
ref = Process.monitor(pid)

receive do
  {:DOWN, ^ref, :process, ^pid, reason} -> handle_crash(reason)
end
```

### Task — Structured Concurrency

```elixir
# Fire and forget
Task.start(fn -> send_email(user) end)

# Async/await
task = Task.async(fn -> expensive_computation() end)
result = Task.await(task, 30_000)

# Parallel map
urls
|> Task.async_stream(&fetch_url/1, max_concurrency: 10)
|> Enum.to_list()
```

---

## GenServer — Stateful Server Processes

```elixir
defmodule Counter do
  use GenServer

  # Client API
  def start_link(initial \\ 0),
    do: GenServer.start_link(__MODULE__, initial, name: __MODULE__)

  def increment, do: GenServer.cast(__MODULE__, :increment)
  def get, do: GenServer.call(__MODULE__, :get)

  # Server callbacks
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}

  @impl true
  def handle_call(:get, _from, count), do: {:reply, count, count}

  @impl true
  def handle_info(:tick, count) do
    Process.send_after(self(), :tick, 1000)
    {:noreply, count}
  end
end
```

**Key principle:** Callbacks run sequentially — this is both the synchronization mechanism and the potential bottleneck. Keep callbacks fast; delegate heavy work to spawned tasks.
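The delegation principle can be sketched with `Task.Supervisor.async_nolink/2`, which runs the work in an unlinked supervised task and delivers the result back as a message. This is a minimal illustration, not a canonical recipe; the module and function names (`ReportServer`, `build_report/1`) are hypothetical, and `MyApp.TaskSupervisor` is assumed to be running.

```elixir
defmodule ReportServer do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)

  @impl true
  def init(state), do: {:ok, state}

  # Reply immediately; the heavy work runs in an unlinked supervised task
  @impl true
  def handle_call({:generate, id}, _from, state) do
    Task.Supervisor.async_nolink(MyApp.TaskSupervisor, fn ->
      {:report_ready, id, build_report(id)}
    end)

    {:reply, :started, state}
  end

  # The task's result arrives as a {ref, result} message, so the
  # callback itself stays fast
  @impl true
  def handle_info({ref, {:report_ready, id, report}}, state) when is_reference(ref) do
    Process.demonitor(ref, [:flush])
    {:noreply, Map.put(state, id, report)}
  end

  # Swallow :DOWN from a crashed task (the server itself is unaffected)
  def handle_info({:DOWN, _ref, :process, _pid, _reason}, state), do: {:noreply, state}

  # Placeholder for the actual expensive work
  defp build_report(id), do: {:done, id}
end
```

Using `async_nolink` (rather than plain `Task.async`) keeps a crashing task from taking the GenServer down with it.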
---

## GenStateMachine — State Machines for Agentic Workflows

**Why GenStateMachine over GenServer for agents?** GenServer has a single state and handles all messages uniformly. GenStateMachine (wrapping Erlang's `:gen_statem`) provides:

- **Explicit states** with per-state event handling
- **Built-in timeouts** — state timeouts (cancelled on state change), named generic timeouts, event timeouts
- **Postpone** — defer events until the right state
- **State enter callbacks** — run setup logic when entering a state
- **Event types** — distinguish calls, casts, info, timeouts, internal events

These map naturally onto agentic patterns: an agent in a "thinking" state ignores new requests (postpone), has retry timeouts, transitions through well-defined phases, and runs setup on each state entry.

### Installation

```elixir
{:gen_state_machine, "~> 3.0"}
```

### Callback Modes

**`:handle_event_function`** (default) — a single `handle_event/4` for all states:

```elixir
defmodule AgentFSM do
  use GenStateMachine

  # Client API
  def start_link(opts), do: GenStateMachine.start_link(__MODULE__, opts)
  def submit(pid, task), do: GenStateMachine.call(pid, {:submit, task})
  def status(pid), do: GenStateMachine.call(pid, :status)

  # Server callbacks
  def init(opts) do
    {:ok, :idle, %{tasks: [], results: [], error: nil, config: opts}}
  end

  # State: idle
  def handle_event({:call, from}, {:submit, task}, :idle, data) do
    {:next_state, :processing, %{data | tasks: [task]},
     [{:reply, from, :accepted}, {:state_timeout, 30_000, :timeout}]}
  end

  def handle_event({:call, from}, :status, state, data) do
    {:keep_state_and_data, [{:reply, from, {state, length(data.results)}}]}
  end

  # State: processing — with timeout
  def handle_event(:state_timeout, :timeout, :processing, data) do
    {:next_state, :error, %{data | error: :timeout}}
  end

  # Completion arrives as an info message
  def handle_event(:info, {:result, result}, :processing, data) do
    {:next_state, :complete, %{data | results: [result | data.results]}}
  end

  # Postpone submissions while processing
  def handle_event({:call, _from}, {:submit, _task}, :processing, _data) do
    {:keep_state_and_data, :postpone}
  end
end
```

**`:state_functions`** — each state is a separate function (state names must be atoms):

```elixir
defmodule WorkflowFSM do
  use GenStateMachine, callback_mode: [:state_functions, :state_enter]

  def init(_),
    do: {:ok, :pending, %{entered_at: nil, params: nil, result: nil, error: nil}}

  # State enter callbacks — run on every state transition
  def pending(:enter, _old_state, data) do
    {:keep_state, %{data | entered_at: DateTime.utc_now()}}
  end

  def pending({:call, from}, {:start, params}, data) do
    {:next_state, :running, %{data | params: params},
     [{:reply, from, :ok}, {:state_timeout, 60_000, :execution_timeout}]}
  end

  def running(:enter, :pending, data) do
    # Setup when entering running from pending
    send(self(), :execute)
    {:keep_state, data}
  end

  def running(:info, :execute, data) do
    result = do_work(data.params)
    {:next_state, :complete, %{data | result: result}}
  end

  def running(:state_timeout, :execution_timeout, data) do
    {:next_state, :failed, %{data | error: :timeout}}
  end

  # Postpone any calls while running
  def running({:call, _from}, _request, _data) do
    {:keep_state_and_data, :postpone}
  end

  def complete(:enter, _old, data), do: {:keep_state, data}

  def complete({:call, from}, :get_result, data) do
    {:keep_state_and_data, [{:reply, from, {:ok, data.result}}]}
  end

  def failed(:enter, _old, data), do: {:keep_state, data}

  def failed({:call, from}, :get_error, data) do
    {:keep_state_and_data, [{:reply, from, {:error, data.error}}]}
  end
end
```

### Timeout Types

| Action | Behavior | Use Case |
|--------|----------|----------|
| `{:timeout, ms, event}` | Event timeout — cancelled by any incoming event | Inactivity detection |
| `{{:timeout, name}, ms, event}` | Generic named timeout — survives state changes | Periodic polling |
| `{:state_timeout, ms, event}` | Cancelled on state change | Per-state deadlines |

### Key Actions in Return Tuples

```elixir
# Return format: {:next_state, new_state, new_data, actions}
actions = [
  {:reply, from, response},            # Reply to caller
  {:state_timeout, 30_000, :deadline}, # State-scoped timeout
  {:timeout, 5_000, :idle},            # Event timeout (any event cancels it)
  {{:timeout, :poll}, 5_000, :tick},   # Named generic timeout
  :postpone,                           # Defer event to next state
  :hibernate,                          # Reduce memory footprint
  {:next_event, :internal, :setup}     # Queue internal event
]
```

### When to Use GenStateMachine vs GenServer

| Scenario | Use |
|----------|-----|
| Simple key-value state, CRUD | GenServer |
| Request/response server | GenServer |
| Well-defined state transitions | **GenStateMachine** |
| Need built-in timeouts per state | **GenStateMachine** |
| Events valid only in certain states | **GenStateMachine** |
| Agentic workflow with phases | **GenStateMachine** |
| Need postpone/defer semantics | **GenStateMachine** |

---

## Supervisors — Let It Crash

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      MyApp.Repo,
      {MyApp.Cache, []},
      {Task.Supervisor, name: MyApp.TaskSupervisor},
      MyAppWeb.Endpoint
    ]

    opts = [strategy: :one_for_one, name: MyApp.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
```

**Strategies:**

- `:one_for_one` — restart only the crashed child (most common)
- `:one_for_all` — restart all children if any crashes (tightly coupled)
- `:rest_for_one` — restart the crashed child plus all children started after it

### DynamicSupervisor

```elixir
defmodule MyApp.SessionSupervisor do
  use DynamicSupervisor

  def start_link(_),
    do: DynamicSupervisor.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok), do: DynamicSupervisor.init(strategy: :one_for_one)

  def start_session(session_id) do
    DynamicSupervisor.start_child(__MODULE__, {MyApp.Session, session_id})
  end
end
```

---

## Registry & Process Discovery

```elixir
# In the supervision tree
{Registry, keys: :unique, name: MyApp.Registry}

# Register the calling process
Registry.register(MyApp.Registry, "session:#{id}", %{})

# Lookup
case Registry.lookup(MyApp.Registry, "session:#{id}") do
  [{pid, _value}] -> {:ok, pid}
  [] -> {:error, :not_found}
end

# Use as a GenServer name
GenServer.start_link(MyWorker, arg,
  name: {:via, Registry, {MyApp.Registry, "worker:#{id}"}})
```
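
These two pieces are commonly combined: a DynamicSupervisor starts per-session workers that name themselves with a via tuple, so callers address them by id instead of tracking pids. A minimal sketch under that assumption; `MyApp.SessionWorker` is a hypothetical module, while `MyApp.Registry` and `MyApp.SessionSupervisor` follow the examples above.

```elixir
defmodule MyApp.SessionWorker do
  use GenServer

  # Name the process via the Registry at start time
  def start_link(session_id) do
    GenServer.start_link(__MODULE__, session_id, name: via(session_id))
  end

  # Callers route by session id — no pid bookkeeping at the call site
  def ping(session_id), do: GenServer.call(via(session_id), :ping)

  defp via(session_id),
    do: {:via, Registry, {MyApp.Registry, "session:#{session_id}"}}

  @impl true
  def init(session_id), do: {:ok, %{session_id: session_id}}

  @impl true
  def handle_call(:ping, _from, state),
    do: {:reply, {:pong, state.session_id}, state}
end

# Usage:
#   DynamicSupervisor.start_child(MyApp.SessionSupervisor, {MyApp.SessionWorker, "abc"})
#   MyApp.SessionWorker.ping("abc")
```

With `keys: :unique`, a second `start_link` for the same id returns `{:error, {:already_started, pid}}`, and the Registry entry is cleaned up automatically when the worker dies.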