Fleet Dashboard Architecture
This document explains how amplihack fleet is structured, why each design
decision was made, and what trade-offs were accepted. Read this before
modifying src/fleet.rs.
Contents
- Overview
- Thread model
- State model
- Key dispatch
- Session discovery
- Persistence layer
- Terminal safety
- Security model
- What was deliberately left out
Overview
The fleet dashboard is implemented entirely in src/fleet.rs as a set of
plain Rust structs and std::thread threads. It does not use tokio,
async/await, or any TUI framework crate. The render loop, the background
refresh threads, and the keyboard reader are all separate concerns connected
through std::sync::mpsc channels.
┌─────────────────────────────────────────────────┐
│                   main thread                   │
│  ┌───────────┐  ┌──────────────┐  ┌──────────┐  │
│  │ keyboard  │  │ render loop  │  │  state   │  │
│  │ reader    │→ │ (crossterm)  │←─│ FleetTui │  │
│  └───────────┘  └──────────────┘  │ UiState  │  │
│        ↓               ↑          └──────────┘  │
│   DashboardKey     RefreshMsg                   │
└─────────────────────────────────────────────────┘
        ↑ mpsc::Receiver          ↑ mpsc::Sender
┌───────────────────┐  ┌────────────────────────┐
│ T4: fast refresh  │  │ T5: slow refresh       │
│ (500 ms)          │  │ (5 s, tmux)            │
│ reads lock files  │  │ reads capture-pane     │
└───────────────────┘  └────────────────────────┘
The entry point run_fleet_dashboard(args, bg_tx) accepts an
Option<Sender<RefreshMsg>>. When bg_tx is None the function runs the
session-collect inline and returns immediately — this path is used in unit
tests to exercise all state transitions without spawning threads.
Thread model
T4: fast refresh (500 ms)
Reads ~/.claude/runtime/locks/ on every tick and sends a
RefreshMsg::Sessions(Vec<FleetSessionEntry>) down the channel. On a
channel-send error (receiver dropped) the thread calls break and exits
cleanly without panicking.
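The loop shape described above can be sketched as follows. This is a minimal illustration, not the code in src/fleet.rs: `spawn_fast_refresh`, the `collect` closure, and the trimmed `FleetSessionEntry` and `RefreshMsg` definitions are assumptions introduced here so the example stands alone.

```rust
use std::sync::mpsc::Sender;
use std::thread::{self, JoinHandle};
use std::time::Duration;

// Trimmed stand-ins for the real types; only what the loop needs.
#[derive(Debug, Clone, PartialEq)]
pub struct FleetSessionEntry {
    pub session_id: String,
}

#[derive(Debug)]
pub enum RefreshMsg {
    Sessions(Vec<FleetSessionEntry>),
}

/// Spawns a refresh thread that calls `collect` once per `interval` and
/// forwards the result. `collect` stands in for the lock-directory scan.
pub fn spawn_fast_refresh<F>(
    tx: Sender<RefreshMsg>,
    interval: Duration,
    collect: F,
) -> JoinHandle<()>
where
    F: Fn() -> Vec<FleetSessionEntry> + Send + 'static,
{
    thread::spawn(move || loop {
        // `send` fails only when the receiver has been dropped, which means
        // the dashboard is gone and the thread has nothing left to do.
        if tx.send(RefreshMsg::Sessions(collect())).is_err() {
            break;
        }
        thread::sleep(interval);
    })
}
```

Dropping the receiver is enough to stop the thread: the next `send` returns `Err`, the loop breaks, and `join()` on the handle returns without a panic.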
T5: slow refresh (5 s)
Calls tmux capture-pane -t <session-id> -p for each active session. Sends
SlowRefreshMsg::CaptureUpdate { session_id, content }. If tmux is absent
the thread immediately exits; the dashboard continues without preview content.
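As a rough sketch of the capture call, assuming the session ID has already been sanitized: `capture_pane_command` and `capture_pane` are hypothetical helper names, and only the tmux arguments come from the text above.

```rust
use std::io;
use std::process::Command;

/// Builds the `tmux capture-pane -t <session-id> -p` invocation.
fn capture_pane_command(session_id: &str) -> Command {
    let mut cmd = Command::new("tmux");
    cmd.args(["capture-pane", "-t", session_id, "-p"]);
    cmd
}

/// Runs the capture and returns the pane text. An `Err` whose kind is
/// `io::ErrorKind::NotFound` means the tmux binary is not installed.
fn capture_pane(session_id: &str) -> io::Result<String> {
    let output = capture_pane_command(session_id).output()?;
    // Pane content is not guaranteed to be valid UTF-8, so decode lossily.
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

A caller in the slow-refresh thread could treat a `NotFound` error as the signal to exit, which matches the behaviour described above.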
Why std::thread instead of tokio?
The dashboard does two things that cooperate poorly with async: raw-mode terminal I/O and blocking file system reads. Dedicating OS threads to each concern makes the code straightforward to read and avoids the overhead of an async runtime for a use-case that is not network-bound.
State model
All mutable render state lives in one struct:
pub struct FleetTuiUiState {
    pub selected_row: usize,
    pub scroll_offset: usize,
    pub active_panel: Panel,
    pub mode: DashboardMode,
    pub status_message: Option<String>,
}
| Field | Purpose |
|---|---|
| selected_row | Which session row is highlighted |
| scroll_offset | How many rows are scrolled off the top of the table |
| active_panel | SessionTable, Editor, ProjectList, or Help |
| mode | Normal, Creating, Adopting, or Help |
| status_message | One-line message shown in the status bar; None = clear |
Having a single state struct makes snapshot-based unit testing straightforward:
construct an initial FleetTuiUiState, dispatch a DashboardKey, assert the
resulting state.
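A snapshot-style test might look like the sketch below. The `dispatch` function, its `row_count` parameter, and the reduced field and variant sets are assumptions made for illustration; the real struct and enum carry more members.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum DashboardKey {
    Up,
    Down,
    Escape,
}

// Reduced copy of the state struct: two fields are enough to show the idea.
#[derive(Debug, Clone, PartialEq, Default)]
struct FleetTuiUiState {
    selected_row: usize,
    status_message: Option<String>,
}

/// Hypothetical pure dispatcher: old state + key + row count -> new state.
fn dispatch(mut state: FleetTuiUiState, key: DashboardKey, row_count: usize) -> FleetTuiUiState {
    match key {
        // saturating_sub keeps the selection at row 0 instead of underflowing.
        DashboardKey::Up => state.selected_row = state.selected_row.saturating_sub(1),
        DashboardKey::Down => {
            if state.selected_row + 1 < row_count {
                state.selected_row += 1;
            }
        }
        DashboardKey::Escape => state.status_message = None,
    }
    state
}
```

Because the dispatcher takes and returns plain values, a test needs no terminal, no threads, and no channels: it builds a state, applies keys, and compares the result with `assert_eq!`.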
Key dispatch
Raw terminal bytes are translated to a typed enum before they reach any application logic:
pub enum DashboardKey {
    Up, Down, PageUp, PageDown,
    Tab, Enter, Escape,
    Char(char),
    CtrlC, CtrlU,
    Unknown,
}
The crossterm event loop maps crossterm::event::KeyEvent values to
DashboardKey variants. All matching on terminal input is isolated to one
function (key_from_event), so the rest of the code only ever sees
DashboardKey values.
Session discovery
collect_observed_fleet_state() returns Vec<FleetSessionEntry>:
pub struct FleetSessionEntry {
    pub session_id: String, // sanitized
    pub status: SessionStatus,
    pub pid: Option<u32>,
    pub project: Option<String>,
    pub age_secs: Option<u64>,
}
pub enum SessionStatus { Active, Idle, Dead, Unknown }
The function:
- Reads ~/.claude/runtime/locks/*.lock with fs::read_dir.
- Calls sanitize_session_id() on every filename component before any use.
- Parses each file as JSON; skips entries with parse errors.
- Validates pid is in 1..=4_194_304; marks entry Dead if out of range.
- Checks whether the PID is live by reading /proc/{pid}/stat (Linux) or using kill(pid, 0) (macOS/BSD); sets SessionStatus accordingly.
- Returns the collected vec sorted by age descending (newest first).
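The PID validation and liveness steps above can be sketched like this. The names `status_for_pid` and `is_live_procfs` are hypothetical, the enum is reduced to two variants, and the mapping "live means Active, otherwise Dead" is an assumption: the text only says the status is set accordingly.

```rust
/// Upper bound quoted in the text; it matches Linux's maximum pid_max (2^22).
const MAX_PID: u32 = 4_194_304;

#[derive(Debug, PartialEq)]
enum SessionStatus {
    Active,
    Dead,
}

/// Classifies a PID. The liveness probe is injected so the range check can be
/// tested without touching the process table.
fn status_for_pid(pid: u32, is_live: impl Fn(u32) -> bool) -> SessionStatus {
    // Out-of-range PIDs are rejected before any liveness probe runs.
    if !(1..=MAX_PID).contains(&pid) {
        return SessionStatus::Dead;
    }
    if is_live(pid) {
        SessionStatus::Active
    } else {
        SessionStatus::Dead
    }
}

/// Linux-only probe: a live process has a /proc/<pid>/stat entry.
fn is_live_procfs(pid: u32) -> bool {
    std::path::Path::new(&format!("/proc/{pid}/stat")).exists()
}
```

Passing the probe as a closure keeps the platform-specific part (procfs on Linux, `kill(pid, 0)` elsewhere) out of the classification logic.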
sanitize_session_id strips any byte outside [a-zA-Z0-9_-] and returns
Err on an empty result.
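A minimal sketch of that contract follows. The error type `EmptySessionId` is an assumption; the source does not say what `Err` carries.

```rust
/// Hypothetical error type for an ID that is empty after sanitization.
#[derive(Debug, PartialEq)]
struct EmptySessionId;

/// Keeps only [a-zA-Z0-9_-]. Every non-ASCII character is made of bytes
/// outside that set, so filtering by char is equivalent to filtering by byte.
fn sanitize_session_id(raw: &str) -> Result<String, EmptySessionId> {
    let cleaned: String = raw
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || *c == '_' || *c == '-')
        .collect();
    if cleaned.is_empty() {
        Err(EmptySessionId)
    } else {
        Ok(cleaned)
    }
}
```

With this shape, `"../etc/passwd"` becomes `"etcpasswd"`, escape bytes are dropped, and an input such as `"../"` is rejected outright because nothing survives the filter.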
Call-site coverage is mandatory at three use sites — not just at read time:
| Use site | Risk if sanitization is skipped |
|---|---|
| Map key | Unsanitized ID used as a HashMap key can cause mismatched lookups if the same session appears under two forms |
| Display string | Raw bytes reach the TUI renderer; malformed Unicode or ANSI escapes corrupt the terminal |
| Path component | Unsanitized ID used in a file path enables path-traversal (../) attacks |
Any new code that consumes a session ID from a lock file must call
sanitize_session_id() before using the result in any of these three
contexts.
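Persistence layer
FleetDashboardSummary is the single persisted struct:
use std::collections::HashMap;
use std::path::PathBuf;
use serde::{Deserialize, Serialize};
#[derive(Serialize, Deserialize, Default)]
pub struct FleetDashboardSummary {
    #[serde(default)]
    pub version: u8,
    #[serde(default)]
    pub projects: Vec<PathBuf>,
    #[serde(default)]
    pub last_full_refresh: Option<i64>,
    // flatten collects unrecognised top-level keys; default alone would drop them
    #[serde(flatten)]
    pub extras: HashMap<String, serde_json::Value>,
}
#[serde(default)] on the named fields means an older file that lacks a field
still deserializes without error. Unknown top-level fields land in extras only
because that field is annotated #[serde(flatten)]; with #[serde(default)] alone
serde would silently ignore them. Flattened entries are written back on the
next save, which provides forward compatibility.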
Atomic write sequence
- Serialize to JSON bytes.
- Open a temp file in the same directory as fleet_dashboard.json (not /tmp; same-directory guarantees the rename is on the same filesystem mount point).
- Set Unix permissions 0600 before writing any bytes.
- Write all bytes.
- fsync the temp file (flush kernel buffers to disk).
- rename(temp, target), which is atomic on POSIX.
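The sequence above can be sketched with the standard library alone (Unix only). `write_atomic`, `write_temp`, and the temp-file naming scheme are assumptions for illustration, and the bytes are taken as already serialized.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

/// Creates the temp file with mode 0600, writes the bytes, and fsyncs it.
fn write_temp(tmp: &Path, bytes: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true) // fail rather than reuse a stale temp file
        .mode(0o600) // permissions are applied at creation, before any write
        .open(tmp)?;
    file.write_all(bytes)?;
    file.sync_all()
}

/// Writes `bytes` to `target` via a temp file in the same directory.
fn write_atomic(target: &Path, bytes: &[u8]) -> io::Result<()> {
    let dir = target.parent().unwrap_or_else(|| Path::new("."));
    let name = target
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "target has no file name"))?;
    // Hypothetical naming scheme: hidden file tagged with the process ID.
    let tmp = dir.join(format!(".{}.{}.tmp", name.to_string_lossy(), std::process::id()));

    let result = write_temp(&tmp, bytes).and_then(|()| fs::rename(&tmp, target));
    if result.is_err() {
        // Best-effort cleanup; the original error is what the caller sees.
        let _ = fs::remove_file(&tmp);
    }
    result
}
```

Passing the mode to `open` folds the "set permissions" step into file creation, so there is no window in which the temp file exists with wider permissions. Readers of the target see either the old content or the new content, never a partial file.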
Capture cache
FleetCaptureCache is an in-memory LRU:
- Backed by a VecDeque<(String, String)> (session_id → content).
- Capacity: 64 entries. When full, the oldest entry is evicted before inserting a new one.
- Each entry is capped at 64 KiB; content exceeding this is truncated before insertion.
- Any field holding this cache in a Serialize struct must be annotated #[serde(skip)] to prevent accidental serialization of ephemeral terminal content.
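A minimal sketch of the eviction and truncation rules, under stated assumptions: the method names, the constants, and the choice to replace an existing entry for the same session are illustrative. This version evicts by insertion order and does not reorder entries on read.

```rust
use std::collections::VecDeque;

const MAX_ENTRIES: usize = 64;
const MAX_ENTRY_BYTES: usize = 64 * 1024;

#[derive(Default)]
struct FleetCaptureCache {
    // Front = oldest entry, back = newest entry.
    entries: VecDeque<(String, String)>,
}

impl FleetCaptureCache {
    fn insert(&mut self, session_id: String, mut content: String) {
        if content.len() > MAX_ENTRY_BYTES {
            // Back up to a char boundary so the String stays valid UTF-8.
            let mut cut = MAX_ENTRY_BYTES;
            while !content.is_char_boundary(cut) {
                cut -= 1;
            }
            content.truncate(cut);
        }
        // Assumption: a fresh capture replaces the previous one for that session.
        self.entries.retain(|(id, _)| id != &session_id);
        if self.entries.len() == MAX_ENTRIES {
            self.entries.pop_front(); // evict the oldest entry
        }
        self.entries.push_back((session_id, content));
    }

    fn get(&self, session_id: &str) -> Option<&str> {
        self.entries
            .iter()
            .find(|(id, _)| id == session_id)
            .map(|(_, content)| content.as_str())
    }
}
```

Lookups are linear scans, which is reasonable at 64 entries. Truncating on a char boundary matters because `String::truncate` panics if the cut falls inside a multi-byte character.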
Terminal safety
RAII terminal guard
crossterm::terminal::enable_raw_mode() is called once at dashboard startup.
It returns only a Result, so the dashboard then holds a TerminalGuard value
whose Drop impl calls crossterm::terminal::disable_raw_mode(), shows the
cursor, and leaves the alternate screen:
struct TerminalGuard;
impl Drop for TerminalGuard {
    fn drop(&mut self) {
        // Errors are ignored: nothing useful can be done with them during teardown.
        let _ = crossterm::terminal::disable_raw_mode();
        let _ = crossterm::execute!(
            std::io::stdout(),
            crossterm::cursor::Show,
            crossterm::terminal::LeaveAlternateScreen,
        );
    }
}
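Because drop runs while the stack unwinds after a panic, the terminal is
restored on panic as well as on normal exit. The guard cannot run if the
process aborts (panic = "abort"), is killed by a signal such as SIGKILL, or
calls std::process::exit(), so those paths can still leave the shell in raw mode.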
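The unwind behaviour can be checked without a terminal. The sketch below replaces raw mode with a flag; `FakeTerminalGuard`, `RAW_MODE`, and the two functions are stand-ins invented for this example and are not part of the dashboard.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Stand-in for the terminal's raw-mode state.
static RAW_MODE: AtomicBool = AtomicBool::new(false);

struct FakeTerminalGuard;

impl FakeTerminalGuard {
    fn enter() -> Self {
        RAW_MODE.store(true, Ordering::SeqCst);
        FakeTerminalGuard
    }
}

impl Drop for FakeTerminalGuard {
    fn drop(&mut self) {
        RAW_MODE.store(false, Ordering::SeqCst);
    }
}

fn render_then_panic() {
    // Bind to a named variable: `let _ = ...` would drop the guard at once.
    let _guard = FakeTerminalGuard::enter();
    panic!("render failed");
}

/// Returns true if the panic was observed and the flag was reset by Drop.
fn guard_survives_panic() -> bool {
    let result = std::panic::catch_unwind(render_then_panic);
    result.is_err() && !RAW_MODE.load(Ordering::SeqCst)
}
```

The binding name is the detail most worth copying: a guard bound to `_guard` lives until the end of the scope, while one bound to `_` is dropped immediately and protects nothing.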
OSC escape stripping
tmux capture-pane output can contain OSC sequences (e.g., terminal
hyperlinks or colour-setting sequences) embedded by the captured program.
Before any captured content reaches the render loop, strip_osc_sequences()
removes OSC sequences in both termination forms that terminals accept:
- \x1b] … \x07 (BEL-terminated OSC: the older xterm form, still widely used)
- \x1b] … \x1b\\ (ST-terminated OSC: the form defined by ECMA-48)
Both forms must be stripped. Stripping only the BEL form leaves an injection vector via ST-terminated sequences; stripping only ST leaves the BEL vector open. Any OSC strip function that handles only one form is incomplete.
This prevents terminal-injection attacks through captured pane content.
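One way to handle both terminators in a single pass is sketched below. This is an illustration, not the function in src/fleet.rs; in particular, dropping the rest of the input when an OSC is never terminated is an assumption chosen to fail closed.

```rust
/// Removes OSC sequences (ESC ] ... terminator) and leaves all other bytes,
/// including CSI colour codes, untouched.
fn strip_osc_sequences(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\x1b' && chars.peek() == Some(&']') {
            chars.next(); // consume ']'
            // Skip the payload up to and including the terminator.
            while let Some(p) = chars.next() {
                if p == '\x07' {
                    break; // BEL terminator
                }
                if p == '\x1b' && chars.peek() == Some(&'\\') {
                    chars.next(); // consume '\' of the ST terminator
                    break;
                }
            }
            // If the input ended first, the unterminated payload is dropped.
        } else {
            out.push(c);
        }
    }
    out
}
```

For a hyperlink such as `ESC ]8;;URL ESC \ text ESC ]8;; ESC \`, only `text` survives, so the visible label is kept while the control payload is discarded.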
Platform guard for PID-reuse check
The PID-reuse guard uses a different mechanism per platform, selected at compile time by conditional compilation:
#[cfg(target_os = "linux")]
fn verify_comm(pid: u32, expected: &str) -> bool {
    // reads /proc/{pid}/comm (body elided)
}
#[cfg(not(target_os = "linux"))]
fn verify_comm(pid: u32, expected: &str) -> bool {
    // uses sysctl CTL_KERN / KERN_PROC / KERN_PROC_PID on macOS/BSD (body elided)
}
On Linux the check reads /proc/{pid}/comm — a single-read, low-overhead
operation. On macOS (and other BSD-derived systems) the check uses the
sysctl API to query the process command name. Both paths must agree with
the expected Claude process name before a signal is sent.
Users on macOS will observe the same behaviour (adoption blocked for
mismatched processes) but the underlying mechanism differs. If a macOS user
sees unexpected adoption failures they should check whether the sysctl
API is available in their sandbox environment.
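For the Linux branch only, the check could look like the sketch below. The helper `comm_matches` is a hypothetical name, and treating an unreadable file as a mismatch is an assumption; the sysctl path is not shown.

```rust
use std::fs;

/// Compares the raw contents of a `comm` file with the expected name.
/// The kernel appends a newline and truncates names to 15 bytes, so
/// `expected` must be the truncated form for long process names.
fn comm_matches(raw: &str, expected: &str) -> bool {
    raw.strip_suffix('\n').unwrap_or(raw) == expected
}

/// Linux sketch: true only if /proc/<pid>/comm is readable and matches.
fn verify_comm(pid: u32, expected: &str) -> bool {
    match fs::read_to_string(format!("/proc/{pid}/comm")) {
        Ok(raw) => comm_matches(&raw, expected),
        // Process gone or /proc unavailable: treat as a mismatch.
        Err(_) => false,
    }
}
```

Returning `false` on any read error keeps the guard conservative: if the process name cannot be confirmed, no signal is sent.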
Security model
| Concern | Mitigation |
|---|---|
| Path traversal via session IDs | sanitize_session_id() on every lock-file name |
| Symlink attacks on project paths | canonicalize() before is_dir() check |
| PID reuse (signal to wrong process) | UID check + /proc/{pid}/comm (or sysctl) before any signal |
| Oversized tmux output exhausting memory | 64 KiB per-entry cap in FleetCaptureCache |
| Sensitive data in serialized state | #[serde(skip)] on capture cache fields |
| Partial writes to fleet config | Atomic rename sequence (temp file same dir) |
| Terminal injection via captured content | OSC sequence stripping before render |
| Leaked paths in TUI error messages | FleetError::Display shows category only; Debug shows detail |
FleetError has 10 variants. The Display impl deliberately omits raw
filesystem paths, PIDs, and internal state — these appear only in the Debug
representation, which is written to log files rather than shown in the TUI.
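The Display/Debug split can be sketched as follows. The two variants and their messages are invented for illustration; the real enum has 10 variants that are not listed in this document.

```rust
use std::fmt;
use std::path::PathBuf;

// Debug is derived, so it prints every field for the log file.
#[derive(Debug)]
enum FleetError {
    LockRead { path: PathBuf },
    SignalRefused { pid: u32 },
}

// Display is written by hand and names the category only.
impl fmt::Display for FleetError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let category = match self {
            FleetError::LockRead { .. } => "could not read a session lock file",
            FleetError::SignalRefused { .. } => "process check failed; no signal sent",
        };
        f.write_str(category)
    }
}
```

The TUI would format errors with `{}` and the logger with `{:?}`, so a path or PID can only reach the screen if someone changes the `Display` impl.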
What was deliberately left out
No tokio
The refresh loop does two blocking sleeps per thread and a directory read.
None of that benefits from cooperative scheduling. Adding tokio would
increase compile time and binary size with no measurable gain.
No TUI framework crate (ratatui, tui-rs, etc.)
The cockpit renderer is a hand-written ANSI renderer using crossterm
primitives. This keeps the dependency surface minimal and makes it easy to
audit what terminal escapes are being sent. A framework crate would be
justified if the widget count grew significantly.
No persistent session-content cache
FleetCaptureCache is in-memory only. Writing captured terminal content to
disk would require careful scrubbing of secrets (API keys, tokens) that may
appear in Claude's output. The per-session 64 KiB cap already limits memory
use to ~4 MiB for a full 64-entry cache.
Conditional workspace helpers: sanitize_session_id and AtomicJsonFile
For contributors: sanitize_session_id() and AtomicJsonFile may not exist as standalone items in the Rust workspace, depending on the build configuration. If either is absent, consult the inline fallback rules documented in the spec (RISK-02 for sanitization, RISK-03 for atomic writes). Do not assume that absence means the behaviour is unimplemented; the logic may be inlined at the call site.
No v0.4.x version tag
Version numbers jump from v0.3.x directly to v0.5.0 to avoid a collision
with tags on the main amploxy branch that used v0.4.x during the parallel
development period. This is documented in CHANGELOG.md.
See also