Landlock

2026-02-24 20:33:51 -08:00 · 03cfdf31a8
commit 03cfdf31a8
parent 797d7564b7
19 changed files with 1315 additions and 238 deletions
--- a/PLAN.md
+++ b/PLAN.md
@@ -1,81 +1,96 @@
 # Implementation Plan

-## Phase 3: Tool Execution
+## Phase 4: Sandboxing

-### Step 3.1: Enrich the content model
-- Replace `ConversationMessage { role, content: String }` with content-block model
-- Define `ContentBlock` enum: `Text(String)`, `ToolUse { id, name, input: Value }`, `ToolResult { tool_use_id, content: String, is_error: bool }`
-- Change `ConversationMessage.content` from `String` to `Vec<ContentBlock>`
-- Add `ConversationMessage::text(role, s)` helper to keep existing call sites clean
-- Update serialization, orchestrator, tests, TUI display
-- **Files:** `src/core/types.rs`, `src/core/history.rs`
-- **Done when:** `cargo test` passes with new model; all existing tests updated
+### Step 4.1: Create sandbox module with policy types and tracing foundation
+- `SandboxPolicy` struct: read-only paths, read-write paths, network allowed bool
+- `Sandbox` struct holding policy + working dir
+- Add `tracing` spans and events throughout from the start:
+  - `#[instrument]` on all public `Sandbox` methods
+  - `debug!` on policy construction with path lists
+  - `info!` on sandbox creation with full policy summary
+- No enforcement yet, just the type skeleton and module wiring
+- **Files:** new `src/sandbox/mod.rs`, `src/sandbox/policy.rs`
+- **Done when:** compiles, unit tests for policy construction, `RUST_LOG=debug cargo test` shows sandbox trace output

-### Step 3.2: Send tool definitions in API requests
-- Add `ToolDefinition { name, description, input_schema: Value }` (provider-agnostic)
-- Extend `ModelProvider::stream` to accept `&[ToolDefinition]`
-- Include `"tools"` array in Claude provider request body
-- **Files:** `src/provider/mod.rs`, `src/provider/claude.rs`
-- **Done when:** API responses contain `tool_use` content blocks in raw SSE stream
+### Step 4.2: Landlock policy builder with startup gate and tracing
+- Translate `SandboxPolicy` into Landlock ruleset using `landlock` crate
+- Kernel requirements:
+  - **ABI v4 (kernel 6.7+):** minimum required -- provides both filesystem and network sandboxing
+  - ABI 1-3 have filesystem only, no network restriction -- tools could exfiltrate data freely
+- Startup behavior -- on launch, check Landlock ABI version:
+  - ABI >= 4: proceed normally (full filesystem + network sandboxing)
+  - ABI < 4 (including unsupported): **refuse to start** with clear error: "Landlock ABI v4+ required (kernel 6.7+). Use --yolo to run without sandboxing."
+  - `--yolo` flag: skip all Landlock enforcement, log `warn!` at startup, show "UNSANDBOXED" in status bar permanently
+- Landlock applied per-child-process via `pre_exec`, NOT to the main process
+  - Main process needs unrestricted network (Claude API) and filesystem (provider)
+  - Each `exec_command` child gets the current policy at spawn time
+  - `:net on/off` takes effect on the next spawned command
+- Tracing:
+  - `info!` on kernel ABI version detected
+  - `debug!` for each rule added to ruleset (path, access flags)
+  - `warn!` on `--yolo` mode ("running without kernel sandboxing")
+  - `error!` if ruleset creation fails unexpectedly
+- **Files:** `src/sandbox/landlock.rs`, add `landlock` dep to `Cargo.toml`, update CLI args in `src/app/`
+- **Done when:** unit test constructs ruleset without panic; `--yolo` flag works on unsupported kernel; startup refuses without flag on unsupported kernel

-### Step 3.3: Parse tool-use blocks from SSE stream
-- Add `StreamEvent::ToolUseStart { id, name }`, `ToolUseInputDelta(String)`, `ToolUseDone`
-- Handle `content_block_start` (type "tool_use"), `content_block_delta` (type "input_json_delta"), `content_block_stop` for tool blocks
-- Track current block type state in SSE parser
-- **Files:** `src/provider/claude.rs`, `src/core/types.rs`
-- **Done when:** Unit test with recorded tool-use SSE fixture asserts correct StreamEvent sequence
+### Step 4.3: Sandbox file I/O API with operation tracing
+- `Sandbox::read_file`, `Sandbox::write_file`, `Sandbox::list_directory`
+- Move `validate_path` from `src/tools/mod.rs` into sandbox
+- Tracing:
+  - `debug!` on every file operation: requested path, canonical path, allowed/denied
+  - `trace!` for path validation steps (join, canonicalize, starts_with check)
+  - `warn!` on path escape attempts (log the attempted path for debugging)
+  - `debug!` on successful operations with bytes read/written
+- **Files:** `src/sandbox/mod.rs`
+- **Done when:** unit tests in tempdir pass; path traversal rejected; `RUST_LOG=trace` shows full path resolution chain

-### Step 3.4: Orchestrator accumulates tool-use blocks
-- Accumulate `ToolUseInputDelta` fragments into JSON buffer per tool-use id
-- On `ToolUseDone`, parse JSON into `ContentBlock::ToolUse`
-- After `StreamEvent::Done`, if assistant message contains ToolUse blocks, enter tool-execution phase
-- **Files:** `src/core/orchestrator.rs`
-- **Done when:** Unit test with mock provider emitting tool-use events produces correct ContentBlocks
+### Step 4.4: Sandbox command execution with process tracing
+- `Sandbox::exec_command(cmd, args, working_dir)` spawns child process with Landlock applied
+- Captures stdout/stderr, enforces timeout
+- Tracing:
+  - `info!` on command spawn: command, args, working_dir, timeout
+  - `debug!` on command completion: exit code, stdout/stderr byte lengths, duration
+  - `warn!` on non-zero exit codes
+  - `error!` on timeout or spawn failure with full context
+  - `trace!` for Landlock application to child process thread
+- **Files:** `src/sandbox/mod.rs` or `src/sandbox/exec.rs`
+- **Done when:** unit test runs `echo hello` in tempdir; write outside sandbox fails (on supported kernels)

-### Step 3.5: Tool trait, registry, and core tools
-- `Tool` trait: `name()`, `description()`, `input_schema() -> Value`, `execute(input: Value, working_dir: &Path) -> Result<ToolOutput>`
-- `ToolOutput { content: String, is_error: bool }`
-- `ToolRegistry`: stores tools, provides `get(name)` and `definitions() -> Vec<ToolDefinition>`
-- Risk level: `AutoApprove` (reads), `RequiresApproval` (writes/shell)
-- Implement: `read_file` (auto), `list_directory` (auto), `write_file` (approval), `shell_exec` (approval)
-- Path validation: `canonicalize` + `starts_with` check, reject paths outside working dir (no Landlock yet)
-- **Files:** New `src/tools/` module: `mod.rs`, `read_file.rs`, `write_file.rs`, `list_directory.rs`, `shell_exec.rs`
-- **Done when:** Unit tests pass for each tool in temp dirs; path traversal rejected
+### Step 4.5: Wire tools through Sandbox
+- Change `Tool::execute` signature to accept `&Sandbox` instead of (or in addition to) `&Path`
+- Update all 4 built-in tools to call `Sandbox` methods instead of `std::fs`/`std::process::Command`
+- Remove direct `std::fs` usage from tool implementations
+- Update `ToolRegistry` and orchestrator to pass `Sandbox`
+- Tracing: tools now inherit sandbox spans automatically via `#[instrument]`
+- **Files:** `src/tools/*.rs`, `src/tools/mod.rs`, `src/core/orchestrator.rs`
+- **Done when:** all existing tool tests pass through Sandbox; no direct `std::fs` in tool files; `RUST_LOG=debug cargo run` shows sandbox operations during tool execution

-### Step 3.6: Approval gate (TUI <-> core)
-- New `UIEvent::ToolApprovalRequest { tool_use_id, tool_name, input_summary }`
-- New `UserAction::ToolApprovalResponse { tool_use_id, approved: bool }`
-- Orchestrator: check risk level -> auto-approve or send approval request and await response
-- Denied tools return `ToolResult { is_error: true }` with denial message
-- TUI: render approval prompt overlay with y/n keybindings
-- **Files:** `src/core/types.rs`, `src/core/orchestrator.rs`, `src/tui/events.rs`, `src/tui/input.rs`, `src/tui/render.rs`
-- **Done when:** Integration test: mock provider + mock TUI channel verifies approval flow
+### Step 4.6: Network toggle
+- `network_allowed: bool` in `SandboxPolicy`
+- `:net on/off` TUI command parsed in input handler, sent as `UserAction::SetNetworkPolicy(bool)`
+- Orchestrator updates `Sandbox` policy. Status bar shows network state.
+- Only available when Landlock ABI >= 4 (kernel 6.7+); command hidden otherwise
+- Status bar shows: network state when available, "UNSANDBOXED" in `--yolo` mode
+- Tracing: `info!` on network policy change
+- **Files:** `src/tui/input.rs`, `src/tui/render.rs`, `src/core/types.rs`, `src/core/orchestrator.rs`, `src/sandbox/mod.rs`
+- **Done when:** toggling `:net` updates status bar; Landlock network restriction applied on ABI >= 4

-### Step 3.7: Tool results fed back to the model
-- After executing tool calls: append assistant message (with ToolUse blocks) to history, append user message with ToolResult blocks, re-call provider
-- Loop: model may respond with more tool calls or text
-- Cap at max iterations (25) to prevent runaway
-- **Files:** `src/core/orchestrator.rs`
-- **Done when:** Integration test: mock provider returns tool-use then text; orchestrator makes two calls. Max-iteration cap tested.
+### Step 4.7: Integration tests
+- Tools + Sandbox in tempdir: write confinement, path traversal rejection, shell command confinement
+- Skip Landlock-specific assertions on ABI < 4
+- Test `--yolo` mode: sandbox constructed but no kernel enforcement
+- Test startup gate: verify error on ABI < 4 without `--yolo`
+- Tests should assert tracing output where relevant (use `tracing-test` crate or `tracing_subscriber::fmt::TestWriter`)
+- **Files:** `tests/sandbox.rs`
+- **Done when:** `cargo test --test sandbox` passes

-### Step 3.8: TUI display for tool activity
-- New `UIEvent::ToolExecuting { tool_name, input_summary }`, `UIEvent::ToolResult { tool_name, output_summary, is_error }`
-- Render tool calls as distinct visual blocks in conversation view
-- Render tool results inline (truncated if long)
-- **Files:** `src/tui/render.rs`, `src/tui/events.rs`
-- **Done when:** Visual check with `cargo run`; TestBackend test for tool block rendering
-
-### Phase 3 verification (end-to-end)
+### Phase 4 verification (end-to-end)
 1. `cargo test` -- all tests pass
 2. `cargo clippy -- -D warnings` -- zero warnings
-3. `cargo run -- --project-dir .` -- ask Claude to read a file, approve, see contents
-4. Ask Claude to write a file -- approve, verify written
-5. Ask Claude to run a shell command -- approve, verify output
-6. Deny an approval -- Claude gets denial and responds gracefully
-
-## Phase 4: Sandboxing
-- Landlock: read-only system, read-write project dir, network blocked
-- Tools execute through `Sandbox`, never directly
-- `:net on/off` toggle, state in status bar
-- Graceful degradation on older kernels
-- **Done when:** Writes outside project dir fail; network toggle works
+3. `RUST_LOG=debug cargo run -- --project-dir .` -- ask Claude to read a file, observe sandbox trace logs showing path validation and Landlock policy
+4. Ask Claude to write a file outside project dir -- sandbox denies with `warn!` log
+5. Ask Claude to run a shell command -- observe command spawn/completion trace
+6. `:net off` then ask for network access -- verify blocked
+7. Without `--yolo` on ABI < 4: verify startup refuses with clear error
+8. With `--yolo`: verify startup succeeds, "UNSANDBOXED" in status bar, `warn!` in logs
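
Note on Step 4.3: the plan describes path validation as a join, canonicalize, `starts_with` chain but does not show it. Below is a minimal sketch of that chain using only the standard library. It is not the code from this commit: the `PathError` type, the `resolve` helper, and the exact signature of `validate_path` are assumptions made for illustration, and the tracing calls the plan asks for are omitted so the block stays dependency-free.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch; PLAN.md does not name one.
#[derive(Debug)]
pub enum PathError {
    /// The path (or its parent) could not be resolved on disk.
    Io(io::Error),
    /// The resolved path lies outside the sandbox root.
    Escape { requested: PathBuf, resolved: PathBuf },
}

/// Canonicalize `joined`. If the final component does not exist yet (for
/// example a file about to be written), canonicalize the parent instead and
/// re-append the file name.
fn resolve(joined: &Path) -> io::Result<PathBuf> {
    fn invalid() -> io::Error {
        io::Error::new(io::ErrorKind::InvalidInput, "path has no usable file name")
    }
    match joined.canonicalize() {
        Ok(resolved) => Ok(resolved),
        Err(err) if err.kind() == io::ErrorKind::NotFound => {
            let parent = joined.parent().ok_or_else(invalid)?;
            let name = joined.file_name().ok_or_else(invalid)?;
            Ok(parent.canonicalize()?.join(name))
        }
        Err(err) => Err(err),
    }
}

/// Assumed shape of `validate_path`: join, canonicalize, `starts_with`.
pub fn validate_path(root: &Path, requested: &Path) -> Result<PathBuf, PathError> {
    // Canonicalize the root too, so both sides of the comparison are
    // symlink-free absolute paths.
    let root = root.canonicalize().map_err(PathError::Io)?;
    // `join` discards `root` when `requested` is absolute; the `starts_with`
    // check below still rejects that case.
    let joined = root.join(requested);
    let resolved = resolve(&joined).map_err(PathError::Io)?;
    // `Path::starts_with` compares whole components, so `/tmp/rootX` is not
    // treated as being inside `/tmp/root`.
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(PathError::Escape {
            requested: requested.to_path_buf(),
            resolved,
        })
    }
}
```

Under these assumptions, `validate_path(project_dir, Path::new("../secret"))` returns `PathError::Escape`, while a relative path to a new file inside the project directory resolves successfully. A userspace check like this is racy (a symlink can be swapped in between validation and use) and does not follow a dangling symlink in the final component, which is one reason the plan layers kernel-level Landlock enforcement on top of it for child processes.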