Update the design and PLAN.md (#11)

Reviewed-on: #11
Co-authored-by: Drew Galbraith <drew@tiramisu.one>
Co-committed-by: Drew Galbraith <drew@tiramisu.one>
This commit is contained in:
Drew 2026-03-14 21:52:38 +00:00 committed by Drew
parent 669e05b716
commit 7420755800
2 changed files with 707 additions and 170 deletions

PLAN.md (694 changes)

@@ -1,96 +1,616 @@

**Removed (previous plan):**

# Implementation Plan

## Phase 4: Sandboxing

### Step 4.1: Create sandbox module with policy types and tracing foundation
- `SandboxPolicy` struct: read-only paths, read-write paths, network allowed bool
- `Sandbox` struct holding policy + working dir
- Add `tracing` spans and events throughout from the start:
  - `#[instrument]` on all public `Sandbox` methods
  - `debug!` on policy construction with path lists
  - `info!` on sandbox creation with full policy summary
- No enforcement yet, just the type skeleton and module wiring
- **Files:** new `src/sandbox/mod.rs`, `src/sandbox/policy.rs`
- **Done when:** compiles, unit tests for policy construction, `RUST_LOG=debug cargo test` shows sandbox trace output

### Step 4.2: Landlock policy builder with startup gate and tracing
- Translate `SandboxPolicy` into Landlock ruleset using `landlock` crate
- Kernel requirements:
  - **ABI v4 (kernel 6.7+):** minimum required -- provides both filesystem and network sandboxing
  - ABI 1-3 have filesystem only, no network restriction -- tools could exfiltrate data freely
- Startup behavior -- on launch, check Landlock ABI version:
  - ABI >= 4: proceed normally (full filesystem + network sandboxing)
  - ABI < 4 (including unsupported): **refuse to start** with clear error: "Landlock ABI v4+ required (kernel 6.7+). Use --yolo to run without sandboxing."
- `--yolo` flag: skip all Landlock enforcement, log `warn!` at startup, show "UNSANDBOXED" in status bar permanently
- Landlock applied per-child-process via `pre_exec`, NOT to the main process
  - Main process needs unrestricted network (Claude API) and filesystem (provider)
  - Each `exec_command` child gets the current policy at spawn time
  - `:net on/off` takes effect on the next spawned command
- Tracing:
  - `info!` on kernel ABI version detected
  - `debug!` for each rule added to ruleset (path, access flags)
  - `warn!` on `--yolo` mode ("running without kernel sandboxing")
  - `error!` if ruleset creation fails unexpectedly
- **Files:** `src/sandbox/landlock.rs`, add `landlock` dep to `Cargo.toml`, update CLI args in `src/app/`
- **Done when:** unit test constructs ruleset without panic; `--yolo` flag works on unsupported kernel; startup refuses without flag on unsupported kernel

### Step 4.3: Sandbox file I/O API with operation tracing
- `Sandbox::read_file`, `Sandbox::write_file`, `Sandbox::list_directory`
- Move `validate_path` from `src/tools/mod.rs` into sandbox
- Tracing:
  - `debug!` on every file operation: requested path, canonical path, allowed/denied
  - `trace!` for path validation steps (join, canonicalize, starts_with check)
  - `warn!` on path escape attempts (log the attempted path for debugging)
  - `debug!` on successful operations with bytes read/written
- **Files:** `src/sandbox/mod.rs`
- **Done when:** unit tests in tempdir pass; path traversal rejected; `RUST_LOG=trace` shows full path resolution chain

### Step 4.4: Sandbox command execution with process tracing
- `Sandbox::exec_command(cmd, args, working_dir)` spawns child process with Landlock applied
- Captures stdout/stderr, enforces timeout
- Tracing:
  - `info!` on command spawn: command, args, working_dir, timeout
  - `debug!` on command completion: exit code, stdout/stderr byte lengths, duration
  - `warn!` on non-zero exit codes
  - `error!` on timeout or spawn failure with full context
  - `trace!` for Landlock application to child process thread
- **Files:** `src/sandbox/mod.rs` or `src/sandbox/exec.rs`
- **Done when:** unit test runs `echo hello` in tempdir; write outside sandbox fails (on supported kernels)

### Step 4.5: Wire tools through Sandbox
- Change `Tool::execute` signature to accept `&Sandbox` instead of (or in addition to) `&Path`
- Update all 4 built-in tools to call `Sandbox` methods instead of `std::fs`/`std::process::Command`
- Remove direct `std::fs` usage from tool implementations
- Update `ToolRegistry` and orchestrator to pass `Sandbox`
- Tracing: tools now inherit sandbox spans automatically via `#[instrument]`
- **Files:** `src/tools/*.rs`, `src/tools/mod.rs`, `src/core/orchestrator.rs`
- **Done when:** all existing tool tests pass through Sandbox; no direct `std::fs` in tool files; `RUST_LOG=debug cargo run` shows sandbox operations during tool execution

### Step 4.6: Network toggle
- `network_allowed: bool` in `SandboxPolicy`
- `:net on/off` TUI command parsed in input handler, sent as `UserAction::SetNetworkPolicy(bool)`
- Orchestrator updates `Sandbox` policy. Status bar shows network state.
- Only available when Landlock ABI >= 4 (kernel 6.7+); command hidden otherwise
- Status bar shows: network state when available, "UNSANDBOXED" in `--yolo` mode
- Tracing: `info!` on network policy change
- **Files:** `src/tui/input.rs`, `src/tui/render.rs`, `src/core/types.rs`, `src/core/orchestrator.rs`, `src/sandbox/mod.rs`
- **Done when:** toggling `:net` updates status bar; Landlock network restriction applied on ABI >= 4

### Step 4.7: Integration tests
- Tools + Sandbox in tempdir: write confinement, path traversal rejection, shell command confinement
- Skip Landlock-specific assertions on ABI < 4
- Test `--yolo` mode: sandbox constructed but no kernel enforcement
- Test startup gate: verify error on ABI < 4 without `--yolo`
- Tests should assert tracing output where relevant (use `tracing-test` crate or `tracing_subscriber::fmt::TestWriter`)
- **Files:** `tests/sandbox.rs`
- **Done when:** `cargo test --test sandbox` passes

### Phase 4 verification (end-to-end)
1. `cargo test` -- all tests pass
2. `cargo clippy -- -D warnings` -- zero warnings
3. `RUST_LOG=debug cargo run -- --project-dir .` -- ask Claude to read a file, observe sandbox trace logs showing path validation and Landlock policy
4. Ask Claude to write a file outside project dir -- sandbox denies with `warn!` log
5. Ask Claude to run a shell command -- observe command spawn/completion trace
6. `:net off` then ask for network access -- verify blocked
7. Without `--yolo` on ABI < 4: verify startup refuses with clear error
8. With `--yolo`: verify startup succeeds, "UNSANDBOXED" in status bar, `warn!` in logs

**Added (new plan):**

# Skate Implementation Plan

This plan closes the gaps between the current codebase and the goals stated in DESIGN.md.
The phases are ordered by dependency -- each phase builds on the previous.

## Current State Summary

Phase 0 (core loop) is functionally complete: the TUI renders conversations, the
orchestrator drives the Claude API, tools execute inside a Landlock sandbox, and the
channel boundary between TUI and core is properly maintained.

The major gaps are:

1. Tool executor tarpc interface -- the orchestrator calls tools directly rather than
   via a tarpc client/server split as DESIGN.md specifies. This is the biggest
   structural gap and a prerequisite for sub-agents (each agent gets its own client).
2. Session logging (JSONL, tree-addressable) -- no `session/` module exists yet.
3. Token tracking -- counts are debug-logged but not surfaced to the user.
4. TUI introspection -- tool blocks and thinking traces cannot be expanded/collapsed.
5. Status bar is sparse -- no token totals, no activity mode, no network state badge.
6. Planning Mode -- no dedicated harness instantiation with restricted sandbox.
7. Sub-agents -- no spawning mechanism, no independent context windows.
8. Space-bar leader key and which-key help overlay are absent.

---

## Phase 1 -- Tool Executor tarpc Interface

**Goal:** Introduce the harness/executor split described in DESIGN.md. The executor
owns the `ToolRegistry` and `Sandbox`; the orchestrator (harness) communicates with
it exclusively through a tarpc client. In this phase the transport is in-process
(tarpc's unbounded channel pair), laying the groundwork for out-of-process execution
in a later phase.

This is the largest structural change in the plan. Every subsequent phase benefits
from the cleaner boundary: sub-agents each get their own executor client (Phase 7),
and the sandbox policy becomes a constructor argument to the executor rather than
something threaded through the orchestrator.
### 1.1 Define the tarpc service
Create `src/executor/mod.rs`:
```rust
#[tarpc::service]
pub trait Executor {
    /// Return the full list of tools this executor exposes, including their
    /// JSON Schema input descriptors. The harness calls this once at startup
    /// and caches the result for the lifetime of the conversation.
    async fn list_tools() -> Vec<ToolDefinition>;

    /// Invoke a single tool by name with a JSON-encoded argument object.
    /// Returns the text content to feed back to the model, or an error string
    /// that is also fed back (so the model can self-correct).
    async fn call_tool(name: String, input: serde_json::Value) -> Result<String, String>;
}
```
`ToolDefinition` is already defined in `core/types.rs` and is provider-agnostic --
no new types are needed on the wire.
### 1.2 Implement `ExecutorServer`
Still in `src/executor/mod.rs`, add:
```rust
/// tarpc clones the server value once per in-flight request, so the struct
/// (and therefore each field) must be `Clone`.
#[derive(Clone)]
pub struct ExecutorServer {
    registry: ToolRegistry,
    sandbox: Arc<Sandbox>,
}

impl ExecutorServer {
    pub fn new(registry: ToolRegistry, sandbox: Sandbox) -> Self {
        Self { registry, sandbox: Arc::new(sandbox) }
    }
}

impl Executor for ExecutorServer {
    async fn list_tools(self, _: Context) -> Vec<ToolDefinition> {
        self.registry.definitions()
    }

    async fn call_tool(self, _: Context, name: String, input: Value) -> Result<String, String> {
        match self.registry.get(&name) {
            None => Err(format!("unknown tool: {name}")),
            Some(tool) => tool
                .execute(input, &self.sandbox)
                .await
                .map_err(|e| e.to_string()),
        }
    }
}
```
The `Arc<Sandbox>` is required because tarpc clones the server struct per request.
### 1.3 In-process transport helper
Add a function to `src/executor/mod.rs` (and re-export from `src/app/mod.rs`) that
wires an `ExecutorServer` to a client over tarpc's in-memory channel:
```rust
use futures::StreamExt;
use tarpc::server::Channel;

/// Spawn an ExecutorServer on the current tokio runtime and return a client
/// connected to it via an in-process channel. The server task runs until
/// the client is dropped.
pub fn spawn_local(server: ExecutorServer) -> ExecutorClient {
    let (client_transport, server_transport) = tarpc::transport::channel::unbounded();
    let channel = tarpc::server::BaseChannel::with_defaults(server_transport);
    // `Channel::execute` yields one response future per incoming request; each
    // must be spawned. (The exact shape varies slightly across tarpc versions.)
    tokio::spawn(channel.execute(server.serve()).for_each(|resp| async {
        tokio::spawn(resp);
    }));
    ExecutorClient::new(tarpc::client::Config::default(), client_transport).spawn()
}
```
### 1.4 Refactor `Orchestrator` to use the client
Currently `Orchestrator<P>` holds `ToolRegistry` and `Sandbox` directly and calls
`tool.execute(input, &sandbox)` in `run_turn`. Replace these fields with:
```rust
executor: ExecutorClient,
tool_definitions: Vec<ToolDefinition>, // fetched once at construction
```
`run_turn` changes from direct tool dispatch to:
```rust
let result = self.executor
.call_tool(context::current(), name, input)
.await;
```
The `tool_definitions` vec is passed to `provider.stream()` instead of being built
from the registry on each call.
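Whether `call_tool` returns `Ok` or `Err`, the text is fed back to the model (per the service contract above) so it can self-correct. A minimal std-only sketch of that mapping, using a hypothetical `ToolResultBlock` stand-in for the real `ContentBlock` variant in `core/types.rs`:

```rust
/// Hypothetical stand-in for the real tool-result content block.
#[derive(Debug, PartialEq)]
pub struct ToolResultBlock {
    pub tool_use_id: String,
    pub content: String,
    pub is_error: bool,
}

/// Both Ok and Err become conversation content; only `is_error` differs,
/// which is what lets the model see and recover from tool failures.
pub fn into_tool_result(tool_use_id: &str, result: Result<String, String>) -> ToolResultBlock {
    let (content, is_error) = match result {
        Ok(text) => (text, false),
        Err(msg) => (msg, true),
    };
    ToolResultBlock { tool_use_id: tool_use_id.to_string(), content, is_error }
}
```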
### 1.5 Update `app/mod.rs`
Replace the inline construction of `ToolRegistry + Sandbox` in `app::run` with:
```rust
let registry = build_tool_registry();
let sandbox = Sandbox::new(policy, project_dir, enforcement)?;
let executor = executor::spawn_local(ExecutorServer::new(registry, sandbox));
let orchestrator = Orchestrator::new(provider, executor, system_prompt);
```
### 1.6 Tests
- Unit: `ExecutorServer::call_tool` with a mock `ToolRegistry` returns correct
output and maps errors to `Err(String)`.
- Integration: `spawn_local` -> `client.call_tool` round-trip through the in-process
channel executes a real `read_file` against a temp dir.
- Integration: existing orchestrator integration tests continue to pass after the
refactor (the mock provider path is unchanged; only tool dispatch changes).
### 1.7 Files touched
| Action | File |
|--------|------|
| New | `src/executor/mod.rs` |
| Modified | `src/core/orchestrator.rs` -- remove registry/sandbox, add executor client |
| Modified | `src/app/mod.rs` -- construct executor, pass client to orchestrator |
| Modified | `Cargo.toml` -- add `tarpc` with `tokio1` feature |
New dependency: `tarpc` (with `tokio1` and `serde-transport` features).
---
## Phase 2 -- Session Logging
**Goal:** Persist every event to a JSONL file. This is the foundation for token
accounting, session resume, and future conversation branching.
### 2.1 Add `src/session/` module
Create `src/session/mod.rs` with the following public surface:
```rust
pub struct SessionWriter { ... }

impl SessionWriter {
    /// Open (or create) a JSONL log at the given path in append mode.
    pub async fn open(path: &Path) -> Result<Self, SessionError>;

    /// Append one event. Never rewrites history.
    pub async fn append(&self, event: &LogEvent) -> Result<(), SessionError>;
}

pub struct SessionReader { ... }

impl SessionReader {
    pub async fn load(path: &Path) -> Result<Vec<LogEvent>, SessionError>;
}
```
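The core of the writer is append-only file I/O. A minimal blocking sketch (the real `SessionWriter` is async via tokio and serializes `LogEvent` with serde; here lines arrive pre-encoded, and `JsonlLog` is an illustrative name):

```rust
use std::fs::{File, OpenOptions};
use std::io::{BufRead, BufReader, Write};
use std::path::Path;

/// Append-only JSONL log sketch: one JSON document per line,
/// history is never rewritten.
pub struct JsonlLog {
    file: File,
}

impl JsonlLog {
    pub fn open(path: &Path) -> std::io::Result<Self> {
        // Append mode: every write lands at the end of the file.
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { file })
    }

    pub fn append(&mut self, json_line: &str) -> std::io::Result<()> {
        writeln!(self.file, "{json_line}")
    }

    pub fn load(path: &Path) -> std::io::Result<Vec<String>> {
        BufReader::new(File::open(path)?).lines().collect()
    }
}
```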
### 2.2 Define `LogEvent`
```rust
#[derive(Serialize, Deserialize)]
pub struct LogEvent {
    pub id: Uuid,
    pub parent_id: Option<Uuid>,
    pub timestamp: DateTime<Utc>,
    pub payload: LogPayload,
    pub token_usage: Option<TokenUsage>,
}

#[derive(Serialize, Deserialize)]
pub enum LogPayload {
    UserMessage { content: String },
    AssistantMessage { content: Vec<ContentBlock> },
    ToolCall { tool_name: String, input: serde_json::Value },
    ToolResult { tool_use_id: String, content: String, is_error: bool },
}

#[derive(Serialize, Deserialize)]
pub struct TokenUsage {
    pub input: u32,
    pub output: u32,
    pub cache_read: Option<u32>,
    pub cache_write: Option<u32>,
}
```
`id` and `parent_id` form a tree that enables future branching. For now the
conversation is linear so `parent_id` is always the id of the previous event.
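The linear chain can be sketched with plain integers standing in for Uuids (simplified types for illustration only):

```rust
/// Sketch of the linear parent_id chain; u64 replaces Uuid for brevity.
#[derive(Debug)]
pub struct ChainedEvent {
    pub id: u64,
    pub parent_id: Option<u64>,
}

/// Each appended event points at the previous one; the root has no parent.
pub fn chain_events(ids: &[u64]) -> Vec<ChainedEvent> {
    let mut out = Vec::new();
    let mut prev: Option<u64> = None;
    for &id in ids {
        out.push(ChainedEvent { id, parent_id: prev });
        prev = Some(id);
    }
    out
}
```

A branch would later be expressed by pointing `parent_id` at any earlier event instead of the most recent one.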
### 2.3 Wire into Orchestrator
- `Orchestrator` holds an `Option<SessionWriter>`.
- Every time the orchestrator pushes to `ConversationHistory` it also appends a
`LogEvent`. Token counts from `StreamEvent::InputTokens` / `OutputTokens` are
stored on the final assistant event of each turn.
- Session file lives at `.skate/sessions/<timestamp>.jsonl`.
### 2.4 Tests
- Unit: `SessionWriter::append` then `SessionReader::load` round-trips all payload
variants.
- Unit: parent_id chain is correct across a simulated multi-turn exchange.
- Integration: run the orchestrator with a mock provider against a temp dir; assert
the JSONL file is written.
---
## Phase 3 -- Token Tracking & Status Bar
**Goal:** Surface token usage in the TUI per-turn and cumulatively.
### 3.1 Per-turn token counts in UIEvent
Add a variant to `UIEvent`:
```rust
UIEvent::TurnComplete { input_tokens: u32, output_tokens: u32 }
```
The orchestrator already receives `StreamEvent::InputTokens` and `OutputTokens`;
it should accumulate them during a turn and emit them in `TurnComplete`.
### 3.2 AppState token counters
Add to `AppState`:
```rust
pub turn_input_tokens: u32,
pub turn_output_tokens: u32,
pub total_input_tokens: u64,
pub total_output_tokens: u64,
```
`events.rs` updates these on `TurnComplete`.
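The update rule can be sketched as follows (a standalone struct holding just these four fields; the real handler lives on `AppState` in `events.rs`):

```rust
/// Token counters as held by AppState (sketch; the real struct has more fields).
#[derive(Default)]
pub struct TokenCounters {
    pub turn_input_tokens: u32,
    pub turn_output_tokens: u32,
    pub total_input_tokens: u64,
    pub total_output_tokens: u64,
}

impl TokenCounters {
    /// On TurnComplete: per-turn counters are replaced, session totals
    /// accumulate (u64 so long sessions cannot overflow).
    pub fn on_turn_complete(&mut self, input_tokens: u32, output_tokens: u32) {
        self.turn_input_tokens = input_tokens;
        self.turn_output_tokens = output_tokens;
        self.total_input_tokens += u64::from(input_tokens);
        self.total_output_tokens += u64::from(output_tokens);
    }
}
```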
### 3.3 Status bar redesign
The status bar currently shows only the mode indicator. Expand it to four sections:
```
[ MODE ] [ ACTIVITY ] [ i:1234 o:567 | total i:9999 o:2345 ] [ NET: off ]
```
- **MODE** -- Normal / Insert / Command
- **ACTIVITY** -- Plan / Execute (Phase 4 adds Plan; for now always "Execute")
- **Tokens** -- per-turn input/output, then session cumulative
- **NET** -- `on` (green) or `off` (red) reflecting `network_allowed`
Update `render.rs` to implement this layout using Ratatui `Layout::horizontal`.
### 3.4 Tests
- Unit: `AppState` accumulates totals correctly across multiple `TurnComplete` events.
- TUI snapshot test (TestBackend): status bar renders all four sections with correct
content after a synthetic `TurnComplete`.
---
## Phase 4 -- TUI Introspection (Expand/Collapse)
**Goal:** Support progressive disclosure -- tool calls and thinking traces start
collapsed; the user can expand them.
### 4.1 Block model
Replace the flat `Vec<DisplayMessage>` in `AppState` with a `Vec<DisplayBlock>`:
```rust
pub enum DisplayBlock {
    UserMessage { content: String },
    AssistantText { content: String },
    ToolCall {
        display: ToolDisplay,
        result: Option<String>,
        expanded: bool,
    },
    Error { message: String },
}
```
### 4.2 Navigation in Normal mode
Add block-level cursor to `AppState`:
```rust
pub focused_block: Option<usize>,
```
Keybindings (Normal mode):
| Key | Action |
|-----|--------|
| `[` | Move focus to previous block |
| `]` | Move focus to next block |
| `Enter` or `Space` | Toggle `expanded` on focused ToolCall block |
| `j` / `k` | Line scroll (unchanged) |
The focused block is highlighted with a distinct border color.
### 4.3 Render changes
`render.rs` must calculate the height of each `DisplayBlock` depending on whether
it is collapsed (1-2 summary lines) or expanded (full content). The scroll offset
operates on rendered terminal rows, not message indices.
Collapsed tool call shows: `> tool_name(arg_summary) -- result_summary`
Expanded tool call shows: full input and output as formatted by `tool_display.rs`.
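The height rule the scroll math depends on can be sketched like this (simplified block shapes, and text wrapping at the pane width is ignored):

```rust
/// Simplified block shapes for illustrating per-block height calculation.
pub enum BlockKind {
    Text { lines: usize },
    ToolCall { content_lines: usize, expanded: bool },
}

pub fn block_height(block: &BlockKind) -> usize {
    match block {
        BlockKind::Text { lines } => *lines,
        // Collapsed tool calls render as a single summary line.
        BlockKind::ToolCall { expanded: false, .. } => 1,
        BlockKind::ToolCall { content_lines, expanded: true } => *content_lines,
    }
}
```

Toggling `expanded` therefore changes the total scrollable height, so the scroll offset must be recomputed from the sum of block heights on every toggle.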
### 4.4 Tests
- Unit: toggling `expanded` on a `ToolCall` block changes height calculation.
- TUI snapshot: collapsed vs expanded render output for `WriteFile` and `ShellExec`.
---
## Phase 5 -- Space-bar Leader Key & Which-Key Overlay
**Goal:** Support vim-style `<Space>` leader chords for configuration actions. This
replaces the `:net on` / `:net off` text commands with discoverable hotkeys.
### 5.1 Leader key state machine
Extend `AppState` with:
```rust
pub leader_active: bool,
pub leader_timeout: Option<Instant>,
```
In Normal mode, pressing `Space` sets `leader_active = true` and starts a 1-second
timeout. The next key is dispatched through the chord table. If the timeout fires
or an unbound key is pressed, leader mode is cancelled with a brief status message.
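The state machine can be sketched std-only; passing `now` explicitly keeps the timeout testable (names and the chord-table shape are illustrative):

```rust
use std::time::{Duration, Instant};

const LEADER_TIMEOUT: Duration = Duration::from_secs(1);

/// Outcome of the key press that follows <Space>.
#[derive(Debug, PartialEq)]
pub enum LeaderResult {
    Chord(char), // bound chord matched
    Cancelled,   // timeout elapsed or unbound key pressed
}

pub struct LeaderState {
    activated_at: Option<Instant>,
}

impl LeaderState {
    pub fn new() -> Self { Self { activated_at: None } }

    pub fn activate(&mut self, now: Instant) { self.activated_at = Some(now); }

    pub fn is_active(&self, now: Instant) -> bool {
        matches!(self.activated_at, Some(t) if now.duration_since(t) < LEADER_TIMEOUT)
    }

    /// Dispatch the key after <Space>. `bound` is the chord table (actions elided).
    pub fn dispatch(&mut self, key: char, now: Instant, bound: &[char]) -> LeaderResult {
        let active = self.is_active(now);
        self.activated_at = None; // the leader is one-shot either way
        if active && bound.contains(&key) {
            LeaderResult::Chord(key)
        } else {
            LeaderResult::Cancelled
        }
    }
}
```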
### 5.2 Initial chord table
| Chord | Action |
|-------|--------|
| `<Space> n` | Toggle network policy |
| `<Space> c` | Clear history (`:clear`) |
| `<Space> p` | Switch to Plan mode (Phase 6) |
| `<Space> ?` | Toggle which-key overlay |
### 5.3 Which-key overlay
A centered popup rendered over the output pane that lists all available chords and
their descriptions. Rendered only when `leader_active = true` (after a short delay,
~200 ms, to avoid flicker on fast typists).
### 5.4 Remove `:net on/off` from command parser
Once leader-key network toggle is in place, remove the text-command duplicates to
keep the command palette small and focused.
### 5.5 Tests
- Unit: leader key state machine transitions (activate, timeout, chord match, cancel).
- TUI snapshot: which-key overlay renders with correct chord list.
---
## Phase 6 -- Planning Mode
**Goal:** A dedicated planning harness with restricted sandbox that writes a single
plan file, plus a mechanism to pipe the plan into an execute harness.
### 6.1 Plan harness sandbox policy
In planning mode the orchestrator is instantiated with a `SandboxPolicy` that grants:
- `/` -- ReadOnly (same as execute)
- `<project_dir>/.skate/plan.md` -- ReadWrite (only this file)
- Network -- off
All other write attempts fail with a sandbox permission error returned to the model.
### 6.2 Survey tool
Add a new tool `ask_user` that allows the model to present structured questions to
the user during planning:
```rust
// Input schema
{
"question": "string",
"options": ["string"] | null // null means free-text answer
}
```
The orchestrator sends a new `UIEvent::SurveyRequest { question, options }`. The TUI
renders an inline prompt. The user's answer is sent back as a `UserAction::SurveyResponse`.
### 6.3 TUI activity mode
`AppState` gets:
```rust
pub activity: Activity,
pub enum Activity { Plan, Execute }
```
Switching activity (via `<Space> p`) instantiates a new orchestrator on a fresh
channel pair. The old orchestrator is shut down cleanly. The status bar ACTIVITY
section updates.
### 6.4 Plan -> Execute handoff
When the user is satisfied with the plan (`<Space> x` or `:exec`):
1. TUI reads `.skate/plan.md`.
2. Constructs a new system prompt: `<original system prompt>\n\n## Plan\n<plan content>`.
3. Instantiates an Execute orchestrator with the full sandbox policy and the
augmented system prompt.
4. Transitions `activity` to `Execute`.
The old Plan orchestrator is dropped.
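Step 2 of the handoff is simple string composition; a sketch (the helper name is illustrative):

```rust
/// Append the plan to the original system prompt under a "## Plan" heading.
pub fn augment_system_prompt(original: &str, plan: &str) -> String {
    format!("{original}\n\n## Plan\n{plan}")
}
```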
### 6.5 Edit plan in $EDITOR
Hotkey `<Space> e` (or `:edit-plan`) suspends the TUI (restores terminal), opens
`$EDITOR` on `.skate/plan.md`, then resumes the TUI after the editor exits.
### 6.6 Tests
- Integration: plan harness rejects write to a file other than plan.md.
- Integration: survey tool round-trip through channel boundary.
- Unit: plan -> execute handoff produces correct augmented system prompt.
---
## Phase 7 -- Sub-Agents
**Goal:** The model can spawn independent sub-agents with their own context windows.
Results are summarised and returned to the parent.
### 7.1 `spawn_agent` tool
Add a new tool with input schema:
```rust
{
"task": "string", // instruction for the sub-agent
"sandbox": { // optional policy overrides
"network": bool,
"extra_write_paths": ["string"]
}
}
```
### 7.2 Sub-agent lifecycle
When `spawn_agent` executes:
1. Create a new `Orchestrator` with an independent conversation history.
2. The sub-agent's system prompt is the parent's system prompt plus the task
description.
3. The sub-agent runs autonomously (no user interaction) until it emits a
`UserAction::Quit` equivalent or hits `MAX_TOOL_ITERATIONS`.
4. The final assistant message is returned as the tool result (the "summary").
5. The sub-agent's session is logged to a child JSONL file linked to the parent
session by a `parent_session_id` field.
### 7.3 TUI sub-agent view
The agent tree is accessible via `<Space> a`. A side panel shows:
```
Parent
+-- sub-agent 1 [running]
+-- sub-agent 2 [done]
```
Pressing Enter on a sub-agent opens a read-only replay of its conversation (scroll
only, no input). This is a stretch goal within this phase -- the core spawning
mechanism is the priority.
### 7.4 Tests
- Integration: spawn_agent with a mock provider runs to completion and returns a
summary string.
- Unit: sub-agent session file has correct parent_session_id link.
- Unit: MAX_TOOL_ITERATIONS limit is respected within sub-agents.
In this phase `spawn_agent` gains a natural implementation: it calls
`executor::spawn_local` with a new `ExecutorServer` configured for the child policy,
constructs a new `Orchestrator` with that client, and runs it to completion. The
tarpc boundary from Phase 1 makes this straightforward.
---
## Phase 8 -- Prompt Caching
**Goal:** Use Anthropic's prompt caching to reduce cost and latency on long
conversations. DESIGN.md notes this as a desired property of message construction.
### 8.1 Cache breakpoints
The Anthropic API supports `"cache_control": {"type": "ephemeral"}` on message
content blocks. The optimal strategy is to mark the last user message of the longest
stable prefix as a cache write point.
In `provider/claude.rs`, when serializing the messages array:
- Mark the system prompt content block with `cache_control` (it never changes).
- Mark the penultimate user message with `cache_control` (the conversation history
that is stable for the current turn).
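The breakpoint choice can be sketched independently of serialization. Assuming the system prompt is carried outside the messages array (and so marked separately), the helper below picks the index of the penultimate user message; types and names are simplified stand-ins for the provider's own:

```rust
/// Simplified message role for illustrating cache-breakpoint selection.
#[derive(Clone, Copy, PartialEq)]
pub enum Role { User, Assistant }

/// Index of the user message to mark with cache_control: the penultimate
/// one, i.e. the boundary of the prefix that is stable for this turn.
pub fn cache_breakpoint(roles: &[Role]) -> Option<usize> {
    let user_indices: Vec<usize> = roles
        .iter()
        .enumerate()
        .filter(|(_, r)| **r == Role::User)
        .map(|(i, _)| i)
        .collect();
    user_indices.len().checked_sub(2).map(|k| user_indices[k])
}
```

With fewer than two user messages there is no stable prefix worth marking, so the function returns `None`.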
### 8.2 Cache token tracking
The `TokenUsage` struct in `session/` already reserves `cache_read` and
`cache_write` fields. `StreamEvent` must be extended:
```rust
StreamEvent::CacheReadTokens(u32),
StreamEvent::CacheWriteTokens(u32),
```
The Anthropic `message_start` event contains `usage.cache_read_input_tokens` and
`usage.cache_creation_input_tokens`. Parse these and emit the new variants.
### 8.3 Status bar update
Add cache tokens to the status bar display: `i:1234(c:800) o:567`.
### 8.4 Tests
- Provider unit test: replay a fixture that contains cache token fields; assert the
new StreamEvent variants are emitted.
- Snapshot test: status bar renders cache token counts correctly.
---
## Dependency Graph
```
Phase 1 (tarpc executor)
 |
 +-- Phase 2 (session logging) -- orchestrator refactor is complete
 |    |
 |    +-- Phase 3 (token tracking) -- requires session TokenUsage struct
 |    |
 |    +-- Phase 7 (sub-agents) -- requires session parent_session_id
 |
 +-- Phase 7 (sub-agents) -- spawn_local reuse is natural after Phase 1

Phase 4 (expand/collapse) -- independent, can be done alongside Phase 3
Phase 5 (leader key) -- independent, prerequisite for Phase 6
Phase 6 (planning mode) -- requires Phase 5 (leader key chord <Space> p)
                        -- benefits from Phase 1 (separate executor per activity)
Phase 8 (prompt caching) -- requires Phase 3 (cache token display)
```
Recommended order: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 8, with 7 after 2 and 6.
---
## Files Touched Per Phase
| Phase | New Files | Modified Files |
|-------|-----------|----------------|
| 1 | `src/executor/mod.rs` | `src/core/orchestrator.rs`, `src/core/types.rs`, `src/app/mod.rs`, `Cargo.toml` |
| 2 | `src/session/mod.rs` | `src/core/orchestrator.rs`, `src/app/mod.rs` |
| 3 | -- | `src/core/types.rs`, `src/core/orchestrator.rs`, `src/tui/events.rs`, `src/tui/render.rs` |
| 4 | -- | `src/tui/mod.rs`, `src/tui/render.rs`, `src/tui/events.rs`, `src/tui/input.rs` |
| 5 | -- | `src/tui/input.rs`, `src/tui/render.rs`, `src/tui/mod.rs` |
| 6 | `src/tools/ask_user.rs` | `src/core/types.rs`, `src/core/orchestrator.rs`, `src/tui/mod.rs`, `src/tui/input.rs`, `src/tui/render.rs`, `src/app/mod.rs` |
| 7 | -- | `src/executor/mod.rs`, `src/core/orchestrator.rs`, `src/tui/render.rs`, `src/tui/input.rs` |
| 8 | -- | `src/provider/claude.rs`, `src/core/types.rs`, `src/session/mod.rs`, `src/tui/render.rs` |
---
## New Dependencies
| Crate | Phase | Reason |
|-------|-------|--------|
| `tarpc` | 1 | RPC service trait + in-process transport |
| `uuid` | 2 | LogEvent ids |
| `chrono` | 2 | Event timestamps (check if already transitive) |
No other new dependencies are needed. All other required functionality
(`serde_json`, `tokio`, `ratatui`, `tracing`) is already present.