# Implementation Plan

## Phase 3: Tool Execution

### Step 3.1: Enrich the content model

- Replace `ConversationMessage { role, content: String }` with content-block model
- Define `ContentBlock` enum: `Text(String)`, `ToolUse { id, name, input: Value }`, `ToolResult { tool_use_id, content: String, is_error: bool }`
- Change `ConversationMessage.content` from `String` to `Vec<ContentBlock>`
- Add `ConversationMessage::text(role, s)` helper to keep existing call sites clean
- Update serialization, orchestrator, tests, TUI display
- **Files:** `src/core/types.rs`, `src/core/history.rs`
- **Done when:** `cargo test` passes with new model; all existing tests updated

### Step 3.2: Send tool definitions in API requests

- Add `ToolDefinition { name, description, input_schema: Value }` (provider-agnostic)
- Extend `ModelProvider::stream` to accept `&[ToolDefinition]`
- Include `"tools"` array in Claude provider request body
- **Files:** `src/provider/mod.rs`, `src/provider/claude.rs`
- **Done when:** API responses contain `tool_use` content blocks in raw SSE stream

### Step 3.3: Parse tool-use blocks from SSE stream

- Add `StreamEvent::ToolUseStart { id, name }`, `ToolUseInputDelta(String)`, `ToolUseDone`
- Handle `content_block_start` (type "tool_use"), `content_block_delta` (type "input_json_delta"), `content_block_stop` for tool blocks
- Track current block type state in SSE parser
- **Files:** `src/provider/claude.rs`, `src/core/types.rs`
- **Done when:** Unit test with recorded tool-use SSE fixture asserts correct `StreamEvent` sequence

### Step 3.4: Orchestrator accumulates tool-use blocks

- Accumulate `ToolUseInputDelta` fragments into JSON buffer per tool-use id
- On `ToolUseDone`, parse JSON into `ContentBlock::ToolUse`
- After `StreamEvent::Done`, if assistant message contains `ToolUse` blocks, enter tool-execution phase
- **Files:** `src/core/orchestrator.rs`
- **Done when:** Unit test with mock provider emitting tool-use events produces correct `ContentBlock`s

### Step 3.5: Tool trait, registry, and core tools

- `Tool` trait: `name()`, `description()`, `input_schema() -> Value`, `execute(input: Value, working_dir: &Path) -> Result`
- `ToolOutput { content: String, is_error: bool }`
- `ToolRegistry`: stores tools, provides `get(name)` and `definitions() -> Vec<ToolDefinition>`
- Risk level: `AutoApprove` (reads), `RequiresApproval` (writes/shell)
- Implement: `read_file` (auto), `list_directory` (auto), `write_file` (approval), `shell_exec` (approval)
- Path validation: `canonicalize` + `starts_with` check, reject paths outside working dir (no Landlock yet)
- **Files:** New `src/tools/` module: `mod.rs`, `read_file.rs`, `write_file.rs`, `list_directory.rs`, `shell_exec.rs`
- **Done when:** Unit tests pass for each tool in temp dirs; path traversal rejected

### Step 3.6: Approval gate (TUI <-> core)

- New `UIEvent::ToolApprovalRequest { tool_use_id, tool_name, input_summary }`
- New `UserAction::ToolApprovalResponse { tool_use_id, approved: bool }`
- Orchestrator: check risk level -> auto-approve or send approval request and await response
- Denied tools return `ToolResult { is_error: true }` with denial message
- TUI: render approval prompt overlay with y/n keybindings
- **Files:** `src/core/types.rs`, `src/core/orchestrator.rs`, `src/tui/events.rs`, `src/tui/input.rs`, `src/tui/render.rs`
- **Done when:** Integration test: mock provider + mock TUI channel verifies approval flow

### Step 3.7: Tool results fed back to the model

- After executing tool calls: append assistant message (with `ToolUse` blocks) to history, append user message with `ToolResult` blocks, re-call provider
- Loop: model may respond with more tool calls or text
- Cap at max iterations (25) to prevent runaway
- **Files:** `src/core/orchestrator.rs`
- **Done when:** Integration test: mock provider returns tool-use then text; orchestrator makes two calls. Max-iteration cap tested.
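
The loop in Step 3.7 can be sketched as follows. This is a minimal, std-only illustration rather than the planned orchestrator: `run_tool_loop`, `Role`, `MAX_TOOL_ITERATIONS`, and the two closure parameters are hypothetical names, the provider is treated as a synchronous function instead of a stream, the approval gate from Step 3.6 is skipped, and `input` is a `String` where the plan uses a JSON `Value`.

```rust
// Sketch only: std-only stand-ins for the plan's types.
// `input` is a String here (the plan uses a JSON `Value`) to avoid external crates.
#[derive(Debug, Clone, PartialEq)]
enum ContentBlock {
    Text(String),
    ToolUse { id: String, name: String, input: String },
    ToolResult { tool_use_id: String, content: String, is_error: bool },
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone)]
struct ConversationMessage {
    role: Role,
    content: Vec<ContentBlock>,
}

/// Cap from Step 3.7.
const MAX_TOOL_ITERATIONS: usize = 25;

/// Calls the provider until it answers without tool calls or the cap is hit.
/// Returns the number of provider calls made, or an error once the cap is reached.
/// `call_provider` and `execute_tool` are hypothetical hooks for this sketch.
fn run_tool_loop(
    history: &mut Vec<ConversationMessage>,
    mut call_provider: impl FnMut(&[ConversationMessage]) -> Vec<ContentBlock>,
    mut execute_tool: impl FnMut(&str, &str) -> Result<String, String>,
) -> Result<usize, String> {
    for call in 1..=MAX_TOOL_ITERATIONS {
        let blocks = call_provider(history.as_slice());

        // Execute every ToolUse block and pair each result with its id.
        let mut results = Vec::new();
        for block in &blocks {
            if let ContentBlock::ToolUse { id, name, input } = block {
                let (content, is_error) = match execute_tool(name, input) {
                    Ok(output) => (output, false),
                    Err(message) => (message, true),
                };
                results.push(ContentBlock::ToolResult {
                    tool_use_id: id.clone(),
                    content,
                    is_error,
                });
            }
        }

        // The assistant message always goes into history first.
        history.push(ConversationMessage { role: Role::Assistant, content: blocks });

        // No tool calls means the model produced its final answer.
        if results.is_empty() {
            return Ok(call);
        }

        // Tool results go back to the model as a user message.
        history.push(ConversationMessage { role: Role::User, content: results });
    }
    Err(format!("stopped after {} provider calls", MAX_TOOL_ITERATIONS))
}
```

With a provider that returns one `ToolUse` block and then plain text, this returns `Ok(2)` and leaves four messages in the history: the original user message, the assistant tool call, the user message carrying the tool result, and the final assistant text. A provider that never stops requesting tools hits the cap and yields `Err`.
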
### Step 3.8: TUI display for tool activity

- New `UIEvent::ToolExecuting { tool_name, input_summary }`, `UIEvent::ToolResult { tool_name, output_summary, is_error }`
- Render tool calls as distinct visual blocks in conversation view
- Render tool results inline (truncated if long)
- **Files:** `src/tui/render.rs`, `src/tui/events.rs`
- **Done when:** Visual check with `cargo run`; `TestBackend` test for tool block rendering

### Phase 3 verification (end-to-end)

1. `cargo test` -- all tests pass
2. `cargo clippy -- -D warnings` -- zero warnings
3. `cargo run -- --project-dir .` -- ask Claude to read a file, approve, see contents
4. Ask Claude to write a file -- approve, verify written
5. Ask Claude to run a shell command -- approve, verify output
6. Deny an approval -- Claude gets denial and responds gracefully

## Phase 4: Sandboxing

- Landlock: read-only system, read-write project dir, network blocked
- Tools execute through `Sandbox`, never directly
- `:net on/off` toggle, state in status bar
- Graceful degradation on older kernels
- **Done when:** Writes outside project dir fail; network toggle works
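
Until Landlock arrives in Phase 4, the `canonicalize` + `starts_with` check from Step 3.5 is the only guard against access outside the working dir. A minimal sketch of that check, using only the standard library, might look like the following. The function name `resolve_in_working_dir` and the choice of `io::Error` kinds are assumptions for illustration; the handling of not-yet-existing targets (needed by `write_file`, since `canonicalize` fails on missing paths) is one possible approach, not something the plan specifies.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolves `requested` against `working_dir` and rejects any path that
/// ends up outside it. Hypothetical helper for this sketch.
fn resolve_in_working_dir(working_dir: &Path, requested: &Path) -> io::Result<PathBuf> {
    let root = working_dir.canonicalize()?;

    // Joining an absolute `requested` replaces `root`, so absolute paths
    // are checked by the same `starts_with` test below.
    let joined = root.join(requested);

    let resolved = match joined.canonicalize() {
        Ok(path) => path,
        // The target does not exist yet (e.g. a new file for write_file):
        // canonicalize the parent directory and re-attach the file name.
        Err(e) if e.kind() == io::ErrorKind::NotFound => {
            let parent = joined.parent().ok_or_else(|| {
                io::Error::new(io::ErrorKind::InvalidInput, "path has no parent")
            })?;
            // `file_name` is None for paths ending in `..`, which are rejected.
            let name = joined.file_name().ok_or_else(|| {
                io::Error::new(io::ErrorKind::InvalidInput, "path has no file name")
            })?;
            parent.canonicalize()?.join(name)
        }
        Err(e) => return Err(e),
    };

    // `Path::starts_with` compares whole components, so `/work` does not
    // match `/workspace`.
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path escapes working directory",
        ))
    }
}
```

A check like this is still subject to races: a symlink can be swapped in between validation and the actual file operation. That gap is one reason to keep kernel-level enforcement as the Phase 4 goal.
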