Use Landlock to restrict bash calls. (#5)

https://docs.kernel.org/userspace-api/landlock.html
Reviewed-on: #5
Co-authored-by: Drew Galbraith <drew@tiramisu.one>
Co-committed-by: Drew Galbraith <drew@tiramisu.one>
Drew 2026-03-02 03:51:46 +00:00 committed by Drew
parent 797d7564b7
commit 7efc6705d3
19 changed files with 1315 additions and 238 deletions

153
PLAN.md

@@ -1,81 +1,96 @@
# Implementation Plan
## Phase 3: Tool Execution
## Phase 4: Sandboxing
### Step 3.1: Enrich the content model
- Replace `ConversationMessage { role, content: String }` with content-block model
- Define `ContentBlock` enum: `Text(String)`, `ToolUse { id, name, input: Value }`, `ToolResult { tool_use_id, content: String, is_error: bool }`
- Change `ConversationMessage.content` from `String` to `Vec<ContentBlock>`
- Add `ConversationMessage::text(role, s)` helper to keep existing call sites clean
- Update serialization, orchestrator, tests, TUI display
- **Files:** `src/core/types.rs`, `src/core/history.rs`
- **Done when:** `cargo test` passes with new model; all existing tests updated
### Step 4.1: Create sandbox module with policy types and tracing foundation
- `SandboxPolicy` struct: read-only paths, read-write paths, network allowed bool
- `Sandbox` struct holding policy + working dir
- Add `tracing` spans and events throughout from the start:
- `#[instrument]` on all public `Sandbox` methods
- `debug!` on policy construction with path lists
- `info!` on sandbox creation with full policy summary
- No enforcement yet, just the type skeleton and module wiring
- **Files:** new `src/sandbox/mod.rs`, `src/sandbox/policy.rs`
- **Done when:** compiles, unit tests for policy construction, `RUST_LOG=debug cargo test` shows sandbox trace output
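The type skeleton from Step 4.1 might look roughly like the following. This is a minimal sketch using only the standard library: the field names, the `for_project` constructor, and the specific system paths are assumptions made for illustration, and the `tracing` instrumentation the step calls for is left out so the block has no external dependencies.

```rust
use std::path::{Path, PathBuf};

/// Declarative description of what a sandboxed child process may touch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SandboxPolicy {
    pub read_only: Vec<PathBuf>,
    pub read_write: Vec<PathBuf>,
    pub network_allowed: bool,
}

impl SandboxPolicy {
    /// Hypothetical default: read-only system paths, read-write project
    /// directory, network blocked. The system path list is an assumption.
    pub fn for_project(project_dir: &Path) -> Self {
        Self {
            read_only: vec![
                PathBuf::from("/usr"),
                PathBuf::from("/lib"),
                PathBuf::from("/etc"),
            ],
            read_write: vec![project_dir.to_path_buf()],
            network_allowed: false,
        }
    }
}

/// Policy plus working directory. No enforcement at this stage.
#[derive(Debug, Clone)]
pub struct Sandbox {
    policy: SandboxPolicy,
    working_dir: PathBuf,
}

impl Sandbox {
    pub fn new(policy: SandboxPolicy, working_dir: PathBuf) -> Self {
        Self { policy, working_dir }
    }

    pub fn policy(&self) -> &SandboxPolicy {
        &self.policy
    }

    pub fn working_dir(&self) -> &Path {
        &self.working_dir
    }
}
```

Keeping the policy a plain data struct means it can be cloned at spawn time, which fits the later requirement that each child process receives the policy that is current when it starts.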
### Step 3.2: Send tool definitions in API requests
- Add `ToolDefinition { name, description, input_schema: Value }` (provider-agnostic)
- Extend `ModelProvider::stream` to accept `&[ToolDefinition]`
- Include `"tools"` array in Claude provider request body
- **Files:** `src/provider/mod.rs`, `src/provider/claude.rs`
- **Done when:** API responses contain `tool_use` content blocks in raw SSE stream
### Step 4.2: Landlock policy builder with startup gate and tracing
- Translate `SandboxPolicy` into Landlock ruleset using `landlock` crate
- Kernel requirements:
- **ABI v4 (kernel 6.7+):** minimum required -- first ABI to provide network rules (TCP bind/connect) in addition to filesystem rules
- ABI v1-v3 cover the filesystem only, with no network restriction -- tools could exfiltrate data freely
- Startup behavior -- on launch, check Landlock ABI version:
- ABI >= 4: proceed normally (full filesystem + network sandboxing)
- ABI < 4 (including unsupported): **refuse to start** with clear error: "Landlock ABI v4+ required (kernel 6.7+). Use --yolo to run without sandboxing."
- `--yolo` flag: skip all Landlock enforcement, log `warn!` at startup, show "UNSANDBOXED" in status bar permanently
- Landlock applied per-child-process via `pre_exec`, NOT to the main process
- Main process needs unrestricted network (Claude API) and filesystem (provider)
- Each `exec_command` child gets the current policy at spawn time
- `:net on/off` takes effect on the next spawned command
- Tracing:
- `info!` on kernel ABI version detected
- `debug!` for each rule added to ruleset (path, access flags)
- `warn!` on `--yolo` mode ("running without kernel sandboxing")
- `error!` if ruleset creation fails unexpectedly
- **Files:** `src/sandbox/landlock.rs`, add `landlock` dep to `Cargo.toml`, update CLI args in `src/app/`
- **Done when:** unit test constructs ruleset without panic; `--yolo` flag works on unsupported kernel; startup refuses without flag on unsupported kernel
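The startup decision described in Step 4.2 can be sketched as a pure function. The names `startup_gate` and `SandboxMode` are hypothetical, and the detected ABI version is passed in as a parameter (with `None` meaning Landlock is unsupported) instead of being queried from the kernel, so the logic can be tested on any machine.

```rust
/// Minimum Landlock ABI that covers both filesystem and network rules.
const MIN_LANDLOCK_ABI: u32 = 4;

/// Outcome of the startup check. Hypothetical type for illustration.
#[derive(Debug, PartialEq, Eq)]
pub enum SandboxMode {
    /// Kernel enforcement is available at the given ABI version.
    Enforced { abi: u32 },
    /// `--yolo` was passed: no kernel enforcement at all.
    Unsandboxed,
}

/// Decide whether the application may start.
///
/// `detected_abi` is `None` when the kernel does not support Landlock.
/// How the ABI is detected is outside this sketch.
pub fn startup_gate(detected_abi: Option<u32>, yolo: bool) -> Result<SandboxMode, String> {
    if yolo {
        // The plan says --yolo skips all Landlock enforcement, regardless of ABI.
        return Ok(SandboxMode::Unsandboxed);
    }
    match detected_abi {
        Some(abi) if abi >= MIN_LANDLOCK_ABI => Ok(SandboxMode::Enforced { abi }),
        _ => Err(
            "Landlock ABI v4+ required (kernel 6.7+). Use --yolo to run without sandboxing."
                .to_string(),
        ),
    }
}
```

A caller would presumably log a warning and set the "UNSANDBOXED" status bar text when it receives `SandboxMode::Unsandboxed`, and exit with the error message otherwise.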
### Step 3.3: Parse tool-use blocks from SSE stream
- Add `StreamEvent::ToolUseStart { id, name }`, `ToolUseInputDelta(String)`, `ToolUseDone`
- Handle `content_block_start` (type "tool_use"), `content_block_delta` (type "input_json_delta"), `content_block_stop` for tool blocks
- Track current block type state in SSE parser
- **Files:** `src/provider/claude.rs`, `src/core/types.rs`
- **Done when:** Unit test with recorded tool-use SSE fixture asserts correct StreamEvent sequence
### Step 4.3: Sandbox file I/O API with operation tracing
- `Sandbox::read_file`, `Sandbox::write_file`, `Sandbox::list_directory`
- Move `validate_path` from `src/tools/mod.rs` into sandbox
- Tracing:
- `debug!` on every file operation: requested path, canonical path, allowed/denied
- `trace!` for path validation steps (join, canonicalize, starts_with check)
- `warn!` on path escape attempts (log the attempted path for debugging)
- `debug!` on successful operations with bytes read/written
- **Files:** `src/sandbox/mod.rs`
- **Done when:** unit tests in tempdir pass; path traversal rejected; `RUST_LOG=trace` shows full path resolution chain
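The join, canonicalize, and `starts_with` sequence mentioned in Step 4.3 could be implemented along these lines. This is a sketch, not the project's actual `validate_path`: the signature and error type are assumptions, and tracing calls are omitted.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolve `requested` against `root` and reject anything that escapes `root`.
pub fn validate_path(root: &Path, requested: &Path) -> io::Result<PathBuf> {
    // Canonicalize the root too, so symlinked roots compare correctly.
    let root = root.canonicalize()?;

    // `join` discards the base when `requested` is absolute, so absolute
    // paths are still subject to the `starts_with` check below.
    let canonical = root.join(requested).canonicalize()?;

    if canonical.starts_with(&root) {
        Ok(canonical)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("path escapes sandbox: {}", requested.display()),
        ))
    }
}
```

One limitation worth noting: `canonicalize` fails for paths that do not exist yet, so a `write_file` that creates new files would need to validate the parent directory and then append the file name.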
### Step 3.4: Orchestrator accumulates tool-use blocks
- Accumulate `ToolUseInputDelta` fragments into JSON buffer per tool-use id
- On `ToolUseDone`, parse JSON into `ContentBlock::ToolUse`
- After `StreamEvent::Done`, if assistant message contains ToolUse blocks, enter tool-execution phase
- **Files:** `src/core/orchestrator.rs`
- **Done when:** Unit test with mock provider emitting tool-use events produces correct ContentBlocks
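The accumulation in Step 3.4 can be illustrated with a small state machine. The `StreamEvent` variants follow the names given in Step 3.3. `PendingToolUse` and `ToolUseAccumulator` are hypothetical, and the final JSON parse is left out (the buffer stays a `String`) to avoid depending on a JSON crate.

```rust
/// Subset of stream events relevant to tool use, as named in Step 3.3.
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    ToolUseStart { id: String, name: String },
    ToolUseInputDelta(String),
    ToolUseDone,
}

/// A tool call whose JSON input has been fully buffered but not yet parsed.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingToolUse {
    pub id: String,
    pub name: String,
    pub input_json: String,
}

/// Collects input fragments for the tool-use block currently being streamed.
#[derive(Default)]
pub struct ToolUseAccumulator {
    current: Option<PendingToolUse>,
    finished: Vec<PendingToolUse>,
}

impl ToolUseAccumulator {
    pub fn push(&mut self, event: StreamEvent) {
        match event {
            StreamEvent::ToolUseStart { id, name } => {
                self.current = Some(PendingToolUse { id, name, input_json: String::new() });
            }
            StreamEvent::ToolUseInputDelta(fragment) => {
                // Deltas outside a tool-use block are ignored in this sketch.
                if let Some(pending) = self.current.as_mut() {
                    pending.input_json.push_str(&fragment);
                }
            }
            StreamEvent::ToolUseDone => {
                if let Some(done) = self.current.take() {
                    self.finished.push(done);
                }
            }
        }
    }

    /// Completed tool calls, in the order they finished.
    pub fn finish(self) -> Vec<PendingToolUse> {
        self.finished
    }
}
```

Because the delta events carry no id, the sketch relies on blocks arriving one at a time. Interleaved blocks would need an id on each delta.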
### Step 4.4: Sandbox command execution with process tracing
- `Sandbox::exec_command(cmd, args, working_dir)` spawns child process with Landlock applied
- Captures stdout/stderr, enforces timeout
- Tracing:
- `info!` on command spawn: command, args, working_dir, timeout
- `debug!` on command completion: exit code, stdout/stderr byte lengths, duration
- `warn!` on non-zero exit codes
- `error!` on timeout or spawn failure with full context
- `trace!` for Landlock application to child process thread
- **Files:** `src/sandbox/mod.rs` or `src/sandbox/exec.rs`
- **Done when:** unit test runs `echo hello` in tempdir; write outside sandbox fails (on supported kernels)
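The spawn, capture, and timeout behavior in Step 4.4 could be sketched with the standard library alone. The function below is a free function with a `timeout` parameter and a hypothetical `CommandOutput` type, not the planned `Sandbox::exec_command` method, and it does not apply Landlock: in a real implementation the ruleset would be installed in the child via `std::os::unix::process::CommandExt::pre_exec`.

```rust
use std::io::{self, Read};
use std::path::Path;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

#[derive(Debug)]
pub struct CommandOutput {
    pub exit_code: Option<i32>,
    pub stdout: String,
    pub stderr: String,
}

/// Read a pipe to the end on a background thread so a chatty child
/// cannot block on a full pipe buffer while we poll for exit.
fn drain<R: Read + Send + 'static>(mut pipe: R) -> thread::JoinHandle<Vec<u8>> {
    thread::spawn(move || {
        let mut buf = Vec::new();
        let _ = pipe.read_to_end(&mut buf);
        buf
    })
}

pub fn exec_command(
    cmd: &str,
    args: &[&str],
    working_dir: &Path,
    timeout: Duration,
) -> io::Result<CommandOutput> {
    // NOTE: no Landlock here. This sketch only covers spawn/capture/timeout.
    let mut child = Command::new(cmd)
        .args(args)
        .current_dir(working_dir)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    let stdout = drain(child.stdout.take().expect("stdout is piped"));
    let stderr = drain(child.stderr.take().expect("stderr is piped"));

    let deadline = Instant::now() + timeout;
    let status = loop {
        if let Some(status) = child.try_wait()? {
            break status;
        }
        if Instant::now() >= deadline {
            // Kill and reap the child, then report the timeout.
            let _ = child.kill();
            let _ = child.wait();
            return Err(io::Error::new(io::ErrorKind::TimedOut, "command timed out"));
        }
        thread::sleep(Duration::from_millis(10));
    };

    let stdout = stdout.join().unwrap_or_default();
    let stderr = stderr.join().unwrap_or_default();
    Ok(CommandOutput {
        exit_code: status.code(),
        stdout: String::from_utf8_lossy(&stdout).into_owned(),
        stderr: String::from_utf8_lossy(&stderr).into_owned(),
    })
}
```

Polling with `try_wait` keeps the sketch dependency-free. An async runtime or a dedicated wait thread would avoid the 10 ms polling granularity.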
### Step 3.5: Tool trait, registry, and core tools
- `Tool` trait: `name()`, `description()`, `input_schema() -> Value`, `execute(input: Value, working_dir: &Path) -> Result<ToolOutput>`
- `ToolOutput { content: String, is_error: bool }`
- `ToolRegistry`: stores tools, provides `get(name)` and `definitions() -> Vec<ToolDefinition>`
- Risk level: `AutoApprove` (reads), `RequiresApproval` (writes/shell)
- Implement: `read_file` (auto), `list_directory` (auto), `write_file` (approval), `shell_exec` (approval)
- Path validation: `canonicalize` + `starts_with` check, reject paths outside working dir (no Landlock yet)
- **Files:** New `src/tools/` module: `mod.rs`, `read_file.rs`, `write_file.rs`, `list_directory.rs`, `shell_exec.rs`
- **Done when:** Unit tests pass for each tool in temp dirs; path traversal rejected
### Step 4.5: Wire tools through Sandbox
- Change `Tool::execute` signature to accept `&Sandbox` instead of (or in addition to) `&Path`
- Update all 4 built-in tools to call `Sandbox` methods instead of `std::fs`/`std::process::Command`
- Remove direct `std::fs` usage from tool implementations
- Update `ToolRegistry` and orchestrator to pass `Sandbox`
- Tracing: tools now inherit sandbox spans automatically via `#[instrument]`
- **Files:** `src/tools/*.rs`, `src/tools/mod.rs`, `src/core/orchestrator.rs`
- **Done when:** all existing tool tests pass through Sandbox; no direct `std::fs` in tool files; `RUST_LOG=debug cargo run` shows sandbox operations during tool execution
### Step 3.6: Approval gate (TUI <-> core)
- New `UIEvent::ToolApprovalRequest { tool_use_id, tool_name, input_summary }`
- New `UserAction::ToolApprovalResponse { tool_use_id, approved: bool }`
- Orchestrator: check risk level -> auto-approve or send approval request and await response
- Denied tools return `ToolResult { is_error: true }` with denial message
- TUI: render approval prompt overlay with y/n keybindings
- **Files:** `src/core/types.rs`, `src/core/orchestrator.rs`, `src/tui/events.rs`, `src/tui/input.rs`, `src/tui/render.rs`
- **Done when:** Integration test: mock provider + mock TUI channel verifies approval flow
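The approval decision in Step 3.6 reduces to a small piece of control flow, sketched below. The real flow would go through the `UIEvent` and `UserAction` channels. Here the user prompt and the tool execution are stand-in closures, and `gate_tool_call` is a hypothetical name.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RiskLevel {
    AutoApprove,
    RequiresApproval,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolResult {
    pub tool_use_id: String,
    pub content: String,
    pub is_error: bool,
}

/// Run a tool only if its risk level allows it or the user approves.
/// `ask_user` stands in for the TUI round trip; `run` stands in for the tool.
pub fn gate_tool_call<A, R>(tool_use_id: &str, risk: RiskLevel, ask_user: A, run: R) -> ToolResult
where
    A: FnOnce() -> bool,
    R: FnOnce() -> Result<String, String>,
{
    let approved = match risk {
        RiskLevel::AutoApprove => true,
        RiskLevel::RequiresApproval => ask_user(),
    };

    if !approved {
        // Denials are reported to the model as an error result.
        return ToolResult {
            tool_use_id: tool_use_id.to_string(),
            content: "tool call denied by user".to_string(),
            is_error: true,
        };
    }

    match run() {
        Ok(content) => ToolResult {
            tool_use_id: tool_use_id.to_string(),
            content,
            is_error: false,
        },
        Err(message) => ToolResult {
            tool_use_id: tool_use_id.to_string(),
            content: message,
            is_error: true,
        },
    }
}
```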
### Step 4.6: Network toggle
- `network_allowed: bool` in `SandboxPolicy`
- `:net on/off` TUI command parsed in input handler, sent as `UserAction::SetNetworkPolicy(bool)`
- Orchestrator updates `Sandbox` policy. Status bar shows network state.
- Only available when Landlock ABI >= 4 (kernel 6.7+); command hidden otherwise
- Status bar shows: network state when available, "UNSANDBOXED" in `--yolo` mode
- Tracing: `info!` on network policy change
- **Files:** `src/tui/input.rs`, `src/tui/render.rs`, `src/core/types.rs`, `src/core/orchestrator.rs`, `src/sandbox/mod.rs`
- **Done when:** toggling `:net` updates status bar; Landlock network restriction applied on ABI >= 4
### Step 3.7: Tool results fed back to the model
- After executing tool calls: append assistant message (with ToolUse blocks) to history, append user message with ToolResult blocks, re-call provider
- Loop: model may respond with more tool calls or text
- Cap at max iterations (25) to prevent a runaway tool loop
- **Files:** `src/core/orchestrator.rs`
- **Done when:** Integration test: mock provider returns tool-use then text; orchestrator makes two calls. Max-iteration cap tested.
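The feedback loop and iteration cap from Step 3.7 can be sketched as follows. History entries are plain strings and tool calls are reduced to tool names, both simplifications for illustration. `ModelTurn` and `run_tool_loop` are hypothetical names, and the provider and tool runner are closures so the loop can be exercised with mocks.

```rust
/// Upper bound on provider round trips per user request.
pub const MAX_TOOL_ITERATIONS: usize = 25;

/// What the model returned for one call. Simplified: tool calls are names only.
#[derive(Debug, Clone, PartialEq)]
pub enum ModelTurn {
    Text(String),
    ToolCalls(Vec<String>),
}

/// Call the provider repeatedly, feeding tool results back, until it
/// answers with text or the iteration cap is reached.
pub fn run_tool_loop<P, T>(mut call_provider: P, mut run_tool: T) -> Result<String, String>
where
    P: FnMut(&[String]) -> ModelTurn,
    T: FnMut(&str) -> String,
{
    let mut history: Vec<String> = Vec::new();

    for _ in 0..MAX_TOOL_ITERATIONS {
        match call_provider(&history) {
            ModelTurn::Text(text) => return Ok(text),
            ModelTurn::ToolCalls(calls) => {
                // Assistant message with the tool-use blocks first ...
                history.push(format!("assistant tool_use: {}", calls.join(", ")));
                // ... then a user message carrying the results.
                let results: Vec<String> =
                    calls.iter().map(|name| run_tool(name.as_str())).collect();
                history.push(format!("user tool_result: {}", results.join(", ")));
            }
        }
    }

    Err(format!("tool loop exceeded {MAX_TOOL_ITERATIONS} iterations"))
}
```

With a mock provider that returns one tool call and then text, the loop makes exactly two provider calls, which matches the integration test described in the step.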
### Step 4.7: Integration tests
- Tools + Sandbox in tempdir: write confinement, path traversal rejection, shell command confinement
- Skip Landlock-specific assertions on ABI < 4
- Test `--yolo` mode: sandbox constructed but no kernel enforcement
- Test startup gate: verify error on ABI < 4 without `--yolo`
- Tests should assert tracing output where relevant (use `tracing-test` crate or `tracing_subscriber::fmt::TestWriter`)
- **Files:** `tests/sandbox.rs`
- **Done when:** `cargo test --test sandbox` passes
### Step 3.8: TUI display for tool activity
- New `UIEvent::ToolExecuting { tool_name, input_summary }`, `UIEvent::ToolResult { tool_name, output_summary, is_error }`
- Render tool calls as distinct visual blocks in conversation view
- Render tool results inline (truncated if long)
- **Files:** `src/tui/render.rs`, `src/tui/events.rs`
- **Done when:** Visual check with `cargo run`; TestBackend test for tool block rendering
### Phase 3 verification (end-to-end)
### Phase 4 verification (end-to-end)
1. `cargo test` -- all tests pass
2. `cargo clippy -- -D warnings` -- zero warnings
3. `cargo run -- --project-dir .` -- ask Claude to read a file, approve, see contents
4. Ask Claude to write a file -- approve, verify written
5. Ask Claude to run a shell command -- approve, verify output
6. Deny an approval -- Claude gets denial and responds gracefully
## Phase 4: Sandboxing
- Landlock: read-only system, read-write project dir, network blocked
- Tools execute through `Sandbox`, never directly
- `:net on/off` toggle, state in status bar
- Graceful degradation on older kernels
- **Done when:** Writes outside project dir fail; network toggle works
3. `RUST_LOG=debug cargo run -- --project-dir .` -- ask Claude to read a file, observe sandbox trace logs showing path validation and Landlock policy
4. Ask Claude to write a file outside project dir -- sandbox denies with `warn!` log
5. Ask Claude to run a shell command -- observe command spawn/completion trace
6. `:net off` then ask for network access -- verify blocked
7. Without `--yolo` on ABI < 4: verify startup refuses with clear error
8. With `--yolo`: verify startup succeeds, "UNSANDBOXED" in status bar, `warn!` in logs