drew/skate

Drew Galbraith 7420755800 Update the design and PLAN.md (#11 )

Reviewed-on: #11
Co-authored-by: Drew Galbraith <drew@tiramisu.one>
Co-committed-by: Drew Galbraith <drew@tiramisu.one>

2026-03-14 21:52:38 +00:00

4.9 KiB


Skate Design

This is a TUI coding agent harness built for one user. The unique design goals compared to other coding agents are:

Allow autonomous execution with no permission prompts, without fully sacrificing security. The user configures the coding agent's permissions before execution, and these are enforced using kernel-level sandboxing.
The UI supports introspection to better understand how the harness is performing. Information may start collapsed, but it is possible to expand things like tool uses and thinking chains. Additionally, token usage is displayed prominently to show where the harness is performing inefficiently.
The UI is modal and supports Neovim-like hotkeys for navigation and configuration (e.g. using the space bar as a leader key). We prefer hotkeys over adding custom slash commands (e.g. /model) to the text chat interface. The text chat should be reserved for things that go straight to the underlying model.

Architecture

The coding agent is broken into three main components: the TUI, the harness, and the tool executor.

The harness communicates with the tool executor via a tarpc interface.

The TUI and harness communicate over a Channel boundary and are fully decoupled in a way that supports running the harness without the TUI (i.e. in scripting mode).
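A minimal sketch of what such a channel boundary could look like, assuming standard-library `mpsc` channels and a harness running on its own thread. The message types (`FrontendMsg`, `HarnessEvent`) and the functions `spawn_harness` and `run_script` are hypothetical names for illustration, and the harness body is a stub that only echoes its input.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Messages a front end (TUI or script) sends to the harness. Hypothetical type.
#[derive(Debug)]
enum FrontendMsg {
    UserInput(String),
    Shutdown,
}

/// Events the harness reports back to whichever front end is attached. Hypothetical type.
#[derive(Debug, PartialEq)]
enum HarnessEvent {
    AssistantText(String),
    TurnComplete,
}

/// Starts a stub harness on its own thread and returns both channel ends.
fn spawn_harness() -> (Sender<FrontendMsg>, Receiver<HarnessEvent>, thread::JoinHandle<()>) {
    let (to_harness, from_frontend) = channel::<FrontendMsg>();
    let (to_frontend, from_harness) = channel::<HarnessEvent>();
    let handle = thread::spawn(move || {
        for msg in from_frontend {
            match msg {
                FrontendMsg::UserInput(text) => {
                    // A real harness would call the model here; this stub echoes.
                    let _ = to_frontend.send(HarnessEvent::AssistantText(format!("echo: {text}")));
                    let _ = to_frontend.send(HarnessEvent::TurnComplete);
                }
                FrontendMsg::Shutdown => break,
            }
        }
    });
    (to_harness, from_harness, handle)
}

/// A scripting-mode front end: sends one prompt and collects events for that turn.
fn run_script(prompt: &str) -> Vec<HarnessEvent> {
    let (tx, rx, handle) = spawn_harness();
    tx.send(FrontendMsg::UserInput(prompt.to_string()))
        .expect("harness thread should be running");
    let mut events = Vec::new();
    for event in rx.iter() {
        let done = event == HarnessEvent::TurnComplete;
        events.push(event);
        if done {
            break;
        }
    }
    tx.send(FrontendMsg::Shutdown)
        .expect("harness thread should be running");
    handle.join().expect("harness thread should not panic");
    events
}
```

Because the harness only sees the two channel ends, a TUI could replace `run_script` without any change to the harness side.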

Harness Design

The harness follows a fairly straightforward design loop.

1. Send message to underlying model.
2. If model requests a tool use, execute it (via a call to the executor) and return to 1.
3. Else, wait for further user input.
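The loop above can be sketched as follows. This is an illustration under assumptions, not the project's actual code: the `Model` and `Executor` traits, the `ModelReply` type, and the string-based transcript are all hypothetical simplifications, and `ScriptedModel` and `EchoExecutor` are stand-ins so the loop can be exercised without a real model.

```rust
use std::collections::VecDeque;

/// What the model asks for after each message. Hypothetical type.
#[derive(Debug, Clone, PartialEq)]
enum ModelReply {
    ToolUse { name: String, args: String },
    Text(String),
}

/// Stand-in for the model integration. Hypothetical trait.
trait Model {
    fn send(&mut self, transcript: &[String]) -> ModelReply;
}

/// Stand-in for the tool executor. Hypothetical trait.
trait Executor {
    fn call_tool(&mut self, name: &str, args: &str) -> Result<String, String>;
}

/// Runs one user turn: steps 1 and 2 repeat until the model stops requesting tools.
fn run_turn(
    model: &mut dyn Model,
    executor: &mut dyn Executor,
    transcript: &mut Vec<String>,
    user_input: &str,
) -> String {
    transcript.push(format!("user: {user_input}"));
    loop {
        // Step 1: send the conversation so far to the model.
        match model.send(transcript.as_slice()) {
            // Step 2: run the tool, record the outcome, and loop back to step 1.
            ModelReply::ToolUse { name, args } => {
                let outcome = match executor.call_tool(&name, &args) {
                    Ok(output) => format!("tool_result({name}): {output}"),
                    // Failures are also reported to the model rather than to the user.
                    Err(err) => format!("tool_error({name}): {err}"),
                };
                transcript.push(outcome);
            }
            // Step 3: no tool requested, so control returns to the user.
            ModelReply::Text(text) => {
                transcript.push(format!("assistant: {text}"));
                return text;
            }
        }
    }
}

/// Test double that replays a fixed list of replies.
struct ScriptedModel {
    replies: VecDeque<ModelReply>,
}

impl Model for ScriptedModel {
    fn send(&mut self, _transcript: &[String]) -> ModelReply {
        self.replies
            .pop_front()
            .unwrap_or(ModelReply::Text(String::new()))
    }
}

/// Test double with a single "echo" tool.
struct EchoExecutor;

impl Executor for EchoExecutor {
    fn call_tool(&mut self, name: &str, args: &str) -> Result<String, String> {
        match name {
            "echo" => Ok(args.to_string()),
            other => Err(format!("unknown tool: {other}")),
        }
    }
}
```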

Harness Instantiation

The harness is instantiated with a system prompt and a tarpc client to the tool executor. (In the first iteration, we use an in-process channel for the tarpc client.)

Model Integration

The harness uses a trait system to make it agnostic to the underlying model used.

This trait unifies a variety of APIs using a StreamEvent interface for streaming responses from the API.

Currently, only Anthropic's Claude API is supported.

Messages are constructed so as to support prompt caching when available.
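As a rough illustration of a provider-agnostic trait built around a StreamEvent type, the sketch below uses a synchronous iterator in place of a real async stream. Only the name `StreamEvent` comes from this document; its variants, the `ModelProvider` trait, `CannedProvider`, and `drain` are assumptions made for the example.

```rust
/// One incremental event from a streaming model response.
/// The variants here are assumed for illustration.
#[derive(Debug, Clone, PartialEq)]
enum StreamEvent {
    TextDelta(String),
    ToolUse { name: String, input_json: String },
    Usage { input_tokens: u64, output_tokens: u64 },
    Done,
}

/// Hypothetical trait that each model backend would implement.
/// A real implementation would likely return an async stream instead.
trait ModelProvider {
    fn stream(
        &mut self,
        system_prompt: &str,
        messages: &[String],
    ) -> Box<dyn Iterator<Item = StreamEvent>>;
}

/// Fake backend that always streams the same response.
struct CannedProvider;

impl ModelProvider for CannedProvider {
    fn stream(
        &mut self,
        _system_prompt: &str,
        _messages: &[String],
    ) -> Box<dyn Iterator<Item = StreamEvent>> {
        Box::new(
            vec![
                StreamEvent::TextDelta("Hello, ".to_string()),
                StreamEvent::TextDelta("world".to_string()),
                StreamEvent::Usage { input_tokens: 12, output_tokens: 7 },
                StreamEvent::Done,
            ]
            .into_iter(),
        )
    }
}

/// Consumes a stream, returning the concatenated text and total output tokens.
fn drain(events: impl Iterator<Item = StreamEvent>) -> (String, u64) {
    let mut text = String::new();
    let mut output_tokens = 0;
    for event in events {
        match event {
            StreamEvent::TextDelta(chunk) => text.push_str(&chunk),
            StreamEvent::Usage { output_tokens: n, .. } => output_tokens += n,
            // Tool requests are ignored in this sketch.
            StreamEvent::ToolUse { .. } => {}
            StreamEvent::Done => break,
        }
    }
    (text, output_tokens)
}
```

Code that consumes the stream, such as `drain`, depends only on the event type, so adding another backend would mean adding another `ModelProvider` implementation.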

Session Logging

JSONL format, one event per line
Events: user message, assistant message, tool call, tool result.
Tree-addressable via parent IDs (enables conversation branching later)
Token usage stored per event
Linear UX for now, branching deferred
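A possible shape for such a log, sketched with hand-written JSON so the example stays dependency-free. The field names (`id`, `parent_id`, `kind`, `content`, `tokens`) and the helper functions are assumptions; the document only specifies the event kinds, parent IDs, and per-event token usage.

```rust
/// The four event kinds listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EventKind {
    UserMessage,
    AssistantMessage,
    ToolCall,
    ToolResult,
}

impl EventKind {
    fn as_str(self) -> &'static str {
        match self {
            EventKind::UserMessage => "user_message",
            EventKind::AssistantMessage => "assistant_message",
            EventKind::ToolCall => "tool_call",
            EventKind::ToolResult => "tool_result",
        }
    }
}

/// One log line. Field names are hypothetical.
#[derive(Debug, Clone)]
struct LogEvent {
    id: u64,
    parent_id: Option<u64>,
    kind: EventKind,
    content: String,
    tokens: u64,
}

/// Escapes a string for embedding inside a JSON string literal.
fn escape_json(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl LogEvent {
    /// Serializes the event as a single JSON object with no trailing newline.
    fn to_json_line(&self) -> String {
        let parent = match self.parent_id {
            Some(id) => id.to_string(),
            None => "null".to_string(),
        };
        format!(
            "{{\"id\":{},\"parent_id\":{},\"kind\":\"{}\",\"content\":\"{}\",\"tokens\":{}}}",
            self.id,
            parent,
            self.kind.as_str(),
            escape_json(&self.content),
            self.tokens
        )
    }
}

/// Walks parent IDs from a leaf back to the root and returns the path root-first.
/// Assumes parent IDs form a tree (no cycles).
fn path_from_root(events: &[LogEvent], leaf: u64) -> Vec<u64> {
    let mut path = Vec::new();
    let mut cursor = Some(leaf);
    while let Some(id) = cursor {
        path.push(id);
        cursor = events.iter().find(|e| e.id == id).and_then(|e| e.parent_id);
    }
    path.reverse();
    path
}
```

With parent IDs in place, a branch is just a second event that names the same parent, so the linear UX can read one root-to-leaf path while the file format already permits others.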

Executor Design

The key aspect of the executor design is that it is configured with sandbox permissions that allow tool use without any user prompting. A tool use either succeeds within the sandbox and its result is returned to the model, or it fails and a permission error is returned to the model.

The sandboxing allows running arbitrary shell commands without prompting.

Executor Interface

The executor interface exposed to the harness has the following methods.

list_available_tools: takes no arguments and returns tool names, descriptions, and argument schema.
call_tool: takes a tool name and its arguments and returns either a result or an error.
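The two methods could be expressed as a plain Rust trait along these lines. In the real project the interface is exposed over tarpc; this sketch omits the RPC layer, and the types `ToolSpec` and `ToolError`, the string-encoded arguments, and `StubExecutor` are assumptions for illustration.

```rust
use std::collections::BTreeSet;

/// Description of one tool as returned by list_available_tools. Hypothetical type.
#[derive(Debug, Clone, PartialEq)]
struct ToolSpec {
    name: String,
    description: String,
    /// JSON schema for the arguments, kept as a raw string in this sketch.
    argument_schema: String,
}

/// Ways a tool call can fail. Hypothetical type.
#[derive(Debug, Clone, PartialEq)]
enum ToolError {
    UnknownTool(String),
    PermissionDenied(String),
}

/// The executor interface as seen by the harness.
trait ToolExecutor {
    fn list_available_tools(&self) -> Vec<ToolSpec>;
    fn call_tool(&self, name: &str, arguments: &str) -> Result<String, ToolError>;
}

/// In-memory stand-in: one working tool plus a set of denied tool names.
struct StubExecutor {
    denied: BTreeSet<String>,
}

impl ToolExecutor for StubExecutor {
    fn list_available_tools(&self) -> Vec<ToolSpec> {
        vec![ToolSpec {
            name: "echo".to_string(),
            description: "Returns its arguments unchanged".to_string(),
            argument_schema: "{\"type\":\"string\"}".to_string(),
        }]
    }

    fn call_tool(&self, name: &str, arguments: &str) -> Result<String, ToolError> {
        // A denial is returned as an ordinary error value, never as a prompt.
        if self.denied.contains(name) {
            return Err(ToolError::PermissionDenied(name.to_string()));
        }
        match name {
            "echo" => Ok(arguments.to_string()),
            other => Err(ToolError::UnknownTool(other.to_string())),
        }
    }
}
```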

Sandboxing

Sandboxing is done using the Linux kernel feature "Landlock".

This allows restricting file system access (either read only, read/write, or no access) as well as network access (either on/off).
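The sketch below models only the configuration side of such a policy; it does not call Landlock and enforces nothing. The names `FsAccess`, `SandboxPolicy`, `planning`, and `access_for` are hypothetical. The lookup assumes grants are additive (a path gets the highest access granted to it or any ancestor, and no access otherwise), and it compares paths lexically without resolving symlinks.

```rust
use std::path::{Path, PathBuf};

/// The three file system access levels described above, ordered weakest first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum FsAccess {
    NoAccess,
    ReadOnly,
    ReadWrite,
}

/// User-configured permissions. Hypothetical type.
#[derive(Debug, Clone)]
struct SandboxPolicy {
    /// Each grant applies to the path itself and everything beneath it.
    grants: Vec<(PathBuf, FsAccess)>,
    network_enabled: bool,
}

impl SandboxPolicy {
    /// Example policy in the spirit of planning mode: read the project,
    /// write only the plan file. The network setting is left to the caller.
    fn planning(project_dir: &Path, plan_file: &Path, network_enabled: bool) -> Self {
        SandboxPolicy {
            grants: vec![
                (project_dir.to_path_buf(), FsAccess::ReadOnly),
                (plan_file.to_path_buf(), FsAccess::ReadWrite),
            ],
            network_enabled,
        }
    }

    /// Returns the highest access level granted to `path` by any matching grant.
    fn access_for(&self, path: &Path) -> FsAccess {
        self.grants
            .iter()
            // Path::starts_with compares whole components, so "/proj" does
            // not match "/project2".
            .filter(|(root, _)| path.starts_with(root))
            .map(|(_, access)| *access)
            .max()
            .unwrap_or(FsAccess::NoAccess)
    }
}
```

A structure like this would be built by the user before execution and then translated into kernel rules when the executor starts.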

TUI Design

The bulk of the complexity of this coding agent is pushed to the TUI in this design.

The driving goals of the TUI are:

Support (neo)vim style keyboard navigation and modal editing.
Full progressive disclosure of information: high-level information is grokkable at a glance, but full tool use and thinking traces can be expanded.
Support for instantiating multiple different instances of the core harness (e.g. different instantiations for code review vs planning vs implementation).

UI

Agent view: Tree-based hierarchy (not flat tabs) for sub-agent inspection
Modes: Normal, Insert, Command (: prefix from Normal mode)
Activity modes: Plan and Execute are visually distinct activities in the TUI
Streaming: Barebones styled text initially, full markdown rendering deferred
Token usage: Per-turn display (between user inputs), cumulative in status bar
Status bar: Mode indicator, current activity (Plan/Execute), token totals, network policy state
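The three input modes can be modeled as a small state machine. In this sketch, only the `:` prefix for Command mode comes from the list above; using `i` to enter Insert mode and Esc or Enter to return to Normal mode are assumptions borrowed from vim conventions, and the `Mode`, `Key`, and `next_mode` names are hypothetical.

```rust
/// The three input modes of the TUI.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Normal,
    Insert,
    Command,
}

/// Minimal key representation for the sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Key {
    Char(char),
    Esc,
    Enter,
}

/// Returns the mode that should be active after `key` is pressed in `mode`.
fn next_mode(mode: Mode, key: Key) -> Mode {
    match (mode, key) {
        // Assumed vim-style binding for entering Insert mode.
        (Mode::Normal, Key::Char('i')) => Mode::Insert,
        // Command mode is entered with ':' from Normal mode only.
        (Mode::Normal, Key::Char(':')) => Mode::Command,
        (Mode::Insert, Key::Esc) => Mode::Normal,
        // Cancelling or submitting a command returns to Normal mode.
        (Mode::Command, Key::Esc) | (Mode::Command, Key::Enter) => Mode::Normal,
        // Every other key leaves the mode unchanged (e.g. typing ':' in Insert).
        (current, _) => current,
    }
}
```

Keeping the transition function pure makes it straightforward to drive the status bar's mode indicator from a single value.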

Planning Mode

In planning mode the TUI instantiates a harness with read access to the project directory and write access to a single plan markdown file.

The TUI then provides a glue mechanism that pipes that plan into a new instantiation of the harness in execute mode.

Additionally, we specify a schema for "surveys" that allow the model to ask the user questions about the plan.

We also provide a hotkey (Ctrl+G or :edit-plan) that allows opening the plan in the user's $EDITOR.

Sub-Agents

Independent context windows with summary passed back to parent
Fully autonomous once spawned
Hard deny on unpermitted actions
Plan executor is a specialized sub-agent where the plan replaces the summary
Direct user interaction with sub-agents deferred
