drew/skate

Drew Galbraith 797d7564b7 Add tool use to the orchestrator (#4)

Add tool use without sandboxing.

Currently available tools are list dir, read file, write file and exec bash.

Reviewed-on: #4
Co-authored-by: Drew Galbraith <drew@tiramisu.one>
Co-committed-by: Drew Galbraith <drew@tiramisu.one>

2026-03-02 03:00:13 +00:00

Implementation Plan

Phase 3: Tool Execution

Step 3.1: Enrich the content model

Replace ConversationMessage { role, content: String } with a content-block model
Define ContentBlock enum: Text(String), ToolUse { id, name, input: Value }, ToolResult { tool_use_id, content: String, is_error: bool }
Change ConversationMessage.content from String to Vec<ContentBlock>
Add ConversationMessage::text(role, s) helper to keep existing call sites clean
Update serialization, orchestrator, tests, TUI display
Files: src/core/types.rs, src/core/history.rs
Done when: cargo test passes with new model; all existing tests updated
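
A minimal sketch of the content-block model described in this step. The `Role` enum, the `has_tool_use` helper, and the use of `String` as a stand-in for the JSON `Value` type are assumptions made so the example compiles without dependencies; the real types in src/core/types.rs may differ.

```rust
/// Stand-in for a JSON value type (assumption: the real code would use
/// something like `serde_json::Value`).
type Value = String;

/// Hypothetical role enum; the plan does not spell out its variants.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
enum ContentBlock {
    Text(String),
    ToolUse { id: String, name: String, input: Value },
    ToolResult { tool_use_id: String, content: String, is_error: bool },
}

#[derive(Debug, Clone, PartialEq)]
struct ConversationMessage {
    role: Role,
    content: Vec<ContentBlock>,
}

impl ConversationMessage {
    /// Keeps plain-text call sites short: one message, one Text block.
    fn text(role: Role, s: impl Into<String>) -> Self {
        Self { role, content: vec![ContentBlock::Text(s.into())] }
    }

    /// Hypothetical helper: true if any block is a tool call.
    fn has_tool_use(&self) -> bool {
        self.content.iter().any(|b| matches!(b, ContentBlock::ToolUse { .. }))
    }
}
```

With this shape, an existing call site that built a message from a string can switch to `ConversationMessage::text(Role::User, "hello")` without touching the block vector directly.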

Step 3.2: Send tool definitions in API requests

Add ToolDefinition { name, description, input_schema: Value } (provider-agnostic)
Extend ModelProvider::stream to accept &[ToolDefinition]
Include "tools" array in Claude provider request body
Files: src/provider/mod.rs, src/provider/claude.rs
Done when: API responses contain tool_use content blocks in raw SSE stream
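
To illustrate the shape of the "tools" array, here is a dependency-free sketch that renders tool definitions as JSON objects with name, description, and input_schema keys. It assumes `input_schema` is held as an already-serialized JSON string, and `escape_json` and `tools_json` are hypothetical helpers; a real provider would more likely build the body with a JSON library.

```rust
/// Provider-agnostic tool description. Assumption: the schema is stored as
/// raw JSON text here instead of a structured `Value`.
struct ToolDefinition {
    name: String,
    description: String,
    input_schema: String,
}

/// Escapes a string for use inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Renders the value of the "tools" key in the request body.
fn tools_json(tools: &[ToolDefinition]) -> String {
    let items: Vec<String> = tools
        .iter()
        .map(|t| {
            format!(
                "{{\"name\":\"{}\",\"description\":\"{}\",\"input_schema\":{}}}",
                escape_json(&t.name),
                escape_json(&t.description),
                t.input_schema
            )
        })
        .collect();
    format!("[{}]", items.join(","))
}
```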

Step 3.3: Parse tool-use blocks from SSE stream

Add StreamEvent::ToolUseStart { id, name }, ToolUseInputDelta(String), ToolUseDone
Handle content_block_start (type "tool_use"), content_block_delta (type "input_json_delta"), content_block_stop for tool blocks
Track current block type state in SSE parser
Files: src/provider/claude.rs, src/core/types.rs
Done when: Unit test with recorded tool-use SSE fixture asserts correct StreamEvent sequence
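
The block-type tracking can be sketched as a small state machine. This example assumes the SSE payloads have already been decoded into a hypothetical `SseEvent` enum, and that `StreamEvent` has a `TextDelta` variant for ordinary text; neither is confirmed by the plan.

```rust
#[derive(Debug, PartialEq)]
enum StreamEvent {
    TextDelta(String), // assumed to exist already for plain text
    ToolUseStart { id: String, name: String },
    ToolUseInputDelta(String),
    ToolUseDone,
}

/// Hypothetical decoded form of the three SSE event types handled here.
enum SseEvent {
    ContentBlockStart { block_type: String, id: Option<String>, name: Option<String> },
    ContentBlockDelta { delta_type: String, payload: String },
    ContentBlockStop,
}

#[derive(Clone, Copy, PartialEq)]
enum BlockKind {
    None,
    Text,
    ToolUse,
}

struct SseParser {
    current: BlockKind,
}

impl SseParser {
    fn new() -> Self {
        Self { current: BlockKind::None }
    }

    /// Maps one decoded SSE event to at most one StreamEvent,
    /// remembering which kind of block is currently open.
    fn handle(&mut self, ev: SseEvent) -> Option<StreamEvent> {
        match ev {
            SseEvent::ContentBlockStart { block_type, id, name } => {
                if block_type == "tool_use" {
                    self.current = BlockKind::ToolUse;
                    Some(StreamEvent::ToolUseStart {
                        id: id.unwrap_or_default(),
                        name: name.unwrap_or_default(),
                    })
                } else {
                    self.current = BlockKind::Text;
                    None
                }
            }
            SseEvent::ContentBlockDelta { delta_type, payload } => {
                match (self.current, delta_type.as_str()) {
                    (BlockKind::ToolUse, "input_json_delta") => {
                        Some(StreamEvent::ToolUseInputDelta(payload))
                    }
                    (BlockKind::Text, "text_delta") => Some(StreamEvent::TextDelta(payload)),
                    _ => None, // delta does not match the open block: ignore
                }
            }
            SseEvent::ContentBlockStop => {
                let was_tool = self.current == BlockKind::ToolUse;
                self.current = BlockKind::None;
                if was_tool { Some(StreamEvent::ToolUseDone) } else { None }
            }
        }
    }
}
```

Only a stop event that closes a tool block yields `ToolUseDone`, which is why the parser has to remember the open block type.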

Step 3.4: Orchestrator accumulates tool-use blocks

Accumulate ToolUseInputDelta fragments into JSON buffer per tool-use id
On ToolUseDone, parse JSON into ContentBlock::ToolUse
After StreamEvent::Done, if assistant message contains ToolUse blocks, enter tool-execution phase
Files: src/core/orchestrator.rs
Done when: Unit test with mock provider emitting tool-use events produces correct ContentBlocks
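
One possible shape for the accumulation logic, shown as a standalone `ToolUseAccumulator` (a hypothetical name). It buffers fragments per tool-use id and emits a `ContentBlock::ToolUse` on `ToolUseDone`. As an assumption for brevity, the buffered JSON text is kept as a `String` where the real orchestrator would parse it into a `Value`.

```rust
use std::collections::HashMap;

enum StreamEvent {
    ToolUseStart { id: String, name: String },
    ToolUseInputDelta(String),
    ToolUseDone,
}

#[derive(Debug, PartialEq)]
enum ContentBlock {
    // `input` stays as raw JSON text in this sketch (assumption).
    ToolUse { id: String, name: String, input: String },
}

#[derive(Default)]
struct ToolUseAccumulator {
    /// tool-use id -> (tool name, JSON buffer)
    buffers: HashMap<String, (String, String)>,
    /// id of the block currently receiving deltas
    current_id: Option<String>,
    finished: Vec<ContentBlock>,
}

impl ToolUseAccumulator {
    fn on_event(&mut self, ev: StreamEvent) {
        match ev {
            StreamEvent::ToolUseStart { id, name } => {
                self.buffers.insert(id.clone(), (name, String::new()));
                self.current_id = Some(id);
            }
            StreamEvent::ToolUseInputDelta(fragment) => {
                if let Some(id) = &self.current_id {
                    if let Some((_, buf)) = self.buffers.get_mut(id) {
                        buf.push_str(&fragment);
                    }
                }
            }
            StreamEvent::ToolUseDone => {
                if let Some(id) = self.current_id.take() {
                    if let Some((name, buf)) = self.buffers.remove(&id) {
                        // The real code would parse `buf` as JSON here and
                        // surface a parse failure as an error.
                        self.finished.push(ContentBlock::ToolUse { id, name, input: buf });
                    }
                }
            }
        }
    }
}
```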

Step 3.5: Tool trait, registry, and core tools

Tool trait: name(), description(), input_schema() -> Value, execute(input: Value, working_dir: &Path) -> Result<ToolOutput>
ToolOutput { content: String, is_error: bool }
ToolRegistry: stores tools, provides get(name) and definitions() -> Vec<ToolDefinition>
Risk level: AutoApprove (reads), RequiresApproval (writes/shell)
Implement: read_file (auto), list_directory (auto), write_file (approval), shell_exec (approval)
Path validation: canonicalize + starts_with check, reject paths outside working dir (no Landlock yet)
Files: New src/tools/ module: mod.rs, read_file.rs, write_file.rs, list_directory.rs, shell_exec.rs
Done when: Unit tests pass for each tool in temp dirs; path traversal rejected
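
The trait, registry, and path check might fit together roughly as follows. This is a sketch under several assumptions: `risk_level()` is placed on the trait, `description()` and `input_schema()` are omitted for brevity, the tool input is simply a path string where the plan uses a JSON `Value`, and `std::io::Result` stands in for the plan's unspecified `Result` type. `resolve_in_working_dir` is a hypothetical helper name.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

#[derive(Debug, Clone, Copy, PartialEq)]
enum RiskLevel {
    AutoApprove,
    RequiresApproval,
}

#[derive(Debug, PartialEq)]
struct ToolOutput {
    content: String,
    is_error: bool,
}

trait Tool {
    fn name(&self) -> &str;
    fn risk_level(&self) -> RiskLevel;
    /// Assumption: `input` is just a relative path in this sketch.
    fn execute(&self, input: String, working_dir: &Path) -> io::Result<ToolOutput>;
}

/// Canonicalize, then require the result to stay under the working dir.
fn resolve_in_working_dir(working_dir: &Path, requested: &Path) -> io::Result<PathBuf> {
    let root = working_dir.canonicalize()?;
    let resolved = root.join(requested).canonicalize()?;
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("{} is outside the working directory", requested.display()),
        ))
    }
}

struct ReadFile;

impl Tool for ReadFile {
    fn name(&self) -> &str {
        "read_file"
    }
    fn risk_level(&self) -> RiskLevel {
        RiskLevel::AutoApprove
    }
    fn execute(&self, input: String, working_dir: &Path) -> io::Result<ToolOutput> {
        let path = resolve_in_working_dir(working_dir, Path::new(&input))?;
        Ok(ToolOutput { content: fs::read_to_string(path)?, is_error: false })
    }
}

#[derive(Default)]
struct ToolRegistry {
    tools: HashMap<String, Box<dyn Tool>>,
}

impl ToolRegistry {
    fn register(&mut self, tool: Box<dyn Tool>) {
        self.tools.insert(tool.name().to_string(), tool);
    }
    fn get(&self, name: &str) -> Option<&dyn Tool> {
        self.tools.get(name).map(|t| t.as_ref())
    }
}
```

Note that `canonicalize` fails for paths that do not exist yet, so a write_file tool creating a new file would likely need to canonicalize the parent directory and check that instead.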

Step 3.6: Approval gate (TUI <-> core)

New UIEvent::ToolApprovalRequest { tool_use_id, tool_name, input_summary }
New UserAction::ToolApprovalResponse { tool_use_id, approved: bool }
Orchestrator: check risk level -> auto-approve or send approval request and await response
Denied tools return ToolResult { is_error: true } with denial message
TUI: render approval prompt overlay with y/n keybindings
Files: src/core/types.rs, src/core/orchestrator.rs, src/tui/events.rs, src/tui/input.rs, src/tui/render.rs
Done when: Integration test: mock provider + mock TUI channel verifies approval flow
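
The gate itself can be sketched with standard-library channels. The real orchestrator may well use async channels instead; `std::sync::mpsc` and the free function `is_approved` are assumptions chosen to keep the example dependency-free.

```rust
use std::sync::mpsc::{Receiver, Sender};

#[derive(Debug, Clone, Copy, PartialEq)]
enum RiskLevel {
    AutoApprove,
    RequiresApproval,
}

#[derive(Debug, PartialEq)]
enum UIEvent {
    ToolApprovalRequest { tool_use_id: String, tool_name: String, input_summary: String },
}

#[derive(Debug, PartialEq)]
enum UserAction {
    ToolApprovalResponse { tool_use_id: String, approved: bool },
}

/// Returns true if the tool may run. Auto-approved tools skip the UI;
/// everything else blocks until the matching response arrives.
fn is_approved(
    risk: RiskLevel,
    tool_use_id: &str,
    tool_name: &str,
    input_summary: &str,
    to_ui: &Sender<UIEvent>,
    from_ui: &Receiver<UserAction>,
) -> bool {
    if risk == RiskLevel::AutoApprove {
        return true;
    }
    let request = UIEvent::ToolApprovalRequest {
        tool_use_id: tool_use_id.to_string(),
        tool_name: tool_name.to_string(),
        input_summary: input_summary.to_string(),
    };
    if to_ui.send(request).is_err() {
        return false; // UI is gone: treat as denial
    }
    loop {
        match from_ui.recv() {
            Ok(UserAction::ToolApprovalResponse { tool_use_id: id, approved })
                if id == tool_use_id =>
            {
                return approved;
            }
            Ok(_) => continue, // response for a different tool call
            Err(_) => return false, // channel closed: treat as denial
        }
    }
}
```

A `false` result would then be turned into a `ToolResult { is_error: true }` carrying the denial message, as described above.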

Step 3.7: Tool results fed back to the model

After executing tool calls: append assistant message (with ToolUse blocks) to history, append user message with ToolResult blocks, re-call provider
Loop: model may respond with more tool calls or text
Cap at max iterations (25) to prevent runaway loops
Files: src/core/orchestrator.rs
Done when: Integration test: mock provider returns tool-use then text; orchestrator makes two calls. Max-iteration cap tested.
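
The feedback loop and its cap could look something like this. The provider and tool executor are passed in as closures (an assumption that keeps the sketch testable with mocks), `run_turn` and `MAX_TOOL_ITERATIONS` are hypothetical names, and tool input is kept as a raw string.

```rust
const MAX_TOOL_ITERATIONS: usize = 25;

#[derive(Debug, Clone, PartialEq)]
enum ContentBlock {
    Text(String),
    ToolUse { id: String, name: String, input: String },
    ToolResult { tool_use_id: String, content: String, is_error: bool },
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
struct ConversationMessage {
    role: Role,
    content: Vec<ContentBlock>,
}

/// Calls the provider until it answers without tool calls.
/// Returns the number of provider calls made, or an error at the cap.
fn run_turn(
    history: &mut Vec<ConversationMessage>,
    mut call_provider: impl FnMut(&[ConversationMessage]) -> Vec<ContentBlock>,
    mut run_tool: impl FnMut(&str, &str) -> (String, bool),
) -> Result<usize, String> {
    for call in 1..=MAX_TOOL_ITERATIONS {
        let reply = call_provider(history.as_slice());

        // Execute every tool call in the reply and collect the results.
        let mut results = Vec::new();
        for block in &reply {
            if let ContentBlock::ToolUse { id, name, input } = block {
                let (content, is_error) = run_tool(name, input);
                results.push(ContentBlock::ToolResult {
                    tool_use_id: id.clone(),
                    content,
                    is_error,
                });
            }
        }

        // Assistant message first, then the user message with results.
        history.push(ConversationMessage { role: Role::Assistant, content: reply });
        if results.is_empty() {
            return Ok(call); // plain text answer: the turn is finished
        }
        history.push(ConversationMessage { role: Role::User, content: results });
    }
    Err(format!("tool loop exceeded {} iterations", MAX_TOOL_ITERATIONS))
}
```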

Step 3.8: TUI display for tool activity

New UIEvent::ToolExecuting { tool_name, input_summary }, UIEvent::ToolResult { tool_name, output_summary, is_error }
Render tool calls as distinct visual blocks in conversation view
Render tool results inline (truncated if long)
Files: src/tui/render.rs, src/tui/events.rs
Done when: Visual check with cargo run; TestBackend test for tool block rendering
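
For the truncated inline rendering, a helper along these lines could produce the `output_summary` string. The name `summarize_output` and the character-based limit are assumptions; the plan does not say whether truncation is by characters or by lines.

```rust
/// Shortens tool output for inline display, counting characters
/// (not bytes) so multi-byte text is never split mid-character.
fn summarize_output(output: &str, max_chars: usize) -> String {
    let total = output.chars().count();
    if total <= max_chars {
        return output.to_string();
    }
    let kept: String = output.chars().take(max_chars).collect();
    format!("{}... ({} more chars)", kept, total - max_chars)
}
```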

Phase 3 verification (end-to-end)

cargo test (all tests pass)
cargo clippy -- -D warnings (zero warnings)
cargo run -- --project-dir . (ask Claude to read a file, approve, see contents)
Ask Claude to write a file: approve, verify written
Ask Claude to run a shell command: approve, verify output
Deny an approval: Claude gets the denial and responds gracefully

Phase 4: Sandboxing

Landlock: read-only system, read-write project dir, network blocked
Tools execute through Sandbox, never directly
:net on/off toggle, state in status bar
Graceful degradation on older kernels
Done when: Writes outside project dir fail; network toggle works
