# Skate Design

This is a TUI coding agent harness built for one user. The design goals that set it apart
from other coding agents are:

1) Allow autonomous execution without permission prompts, but without fully sacrificing security.
The user configures which permissions the coding agent has before execution, and these
are enforced using kernel-level sandboxing.

2) The UI supports introspection to better understand how the harness is performing.
Information may start collapsed, but details such as tool uses and thinking chains can
always be expanded. Additionally, token usage is displayed prominently to show where the
harness is performing inefficiently.

3) The UI is modal and supports Neovim-like hotkeys for navigation and configuration
(e.g. using the space bar as a leader key). We prefer hotkeys over adding custom
slash commands (such as `/model`) to the text chat interface. The text chat should be reserved for
things that go straight to the underlying model.


## Architecture

The coding agent is broken into three main components: the TUI, the harness, and the tool executor.

The harness communicates with the tool executor via a tarpc interface.

The TUI and harness communicate over a channel boundary and are fully decoupled,
which supports running the harness without the TUI (e.g. in scripting mode).
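
As a rough illustration of this decoupling, the sketch below uses standard-library `mpsc` channels. The `Command` and `Event` enums, `spawn_harness`, and `run_scripted` are hypothetical names chosen for this example, and the harness body is a stand-in that echoes input rather than calling a model.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical messages a front end (TUI or script) sends to the harness.
#[derive(Debug, Clone, PartialEq)]
pub enum Command {
    UserMessage(String),
    Shutdown,
}

/// Hypothetical events the harness emits for any front end to render.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    AssistantText(String),
    TurnComplete,
}

/// Stand-in harness: echoes each user message instead of calling a model.
pub fn spawn_harness(
    commands: Receiver<Command>,
    events: Sender<Event>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for command in commands {
            match command {
                Command::UserMessage(text) => {
                    // A real harness would run its model/tool loop here.
                    let _ = events.send(Event::AssistantText(format!("echo: {text}")));
                    let _ = events.send(Event::TurnComplete);
                }
                Command::Shutdown => break,
            }
        }
    })
}

/// Scripting-mode driver: no TUI, just send one prompt and collect the events.
pub fn run_scripted(prompt: &str) -> Vec<Event> {
    let (command_tx, command_rx) = channel();
    let (event_tx, event_rx) = channel();
    let handle = spawn_harness(command_rx, event_tx);

    command_tx
        .send(Command::UserMessage(prompt.to_string()))
        .unwrap();
    command_tx.send(Command::Shutdown).unwrap();
    handle.join().unwrap();

    // The harness dropped its sender when its thread exited, so this terminates.
    event_rx.into_iter().collect()
}
```

Because the harness only ever sees the two channel ends, a TUI and a script driver such as `run_scripted` are interchangeable from its point of view.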

## Harness Design

The harness follows a fairly straightforward loop:

1. Send message to underlying model.
2. If model requests a tool use, execute it (via a call to the executor) and return to 1.
3. Else, wait for further user input.
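
The three steps above can be sketched as a single function that runs one user turn. The `Model` and `Executor` traits, the `Message` and `ModelReply` types, and the two test doubles are assumptions made for this example; in particular, the real executor is reached over tarpc rather than through a local trait object.

```rust
/// What the model asks for at the end of one response (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub enum ModelReply {
    ToolUse { name: String, args: String },
    Text(String),
}

/// One entry in the conversation transcript.
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    User(String),
    Assistant(ModelReply),
    ToolResult(Result<String, String>),
}

pub trait Model {
    fn respond(&mut self, transcript: &[Message]) -> ModelReply;
}

pub trait Executor {
    fn call_tool(&mut self, name: &str, args: &str) -> Result<String, String>;
}

/// Runs one user turn: steps 1 and 2 repeat until the model stops requesting
/// tools, then control returns to the caller to wait for more input (step 3).
pub fn run_turn(
    model: &mut dyn Model,
    executor: &mut dyn Executor,
    transcript: &mut Vec<Message>,
    user_input: &str,
) -> String {
    transcript.push(Message::User(user_input.to_string()));
    loop {
        let reply = model.respond(transcript.as_slice());
        transcript.push(Message::Assistant(reply.clone()));
        match reply {
            ModelReply::ToolUse { name, args } => {
                // Errors are fed back to the model rather than aborting the turn.
                let result = executor.call_tool(&name, &args);
                transcript.push(Message::ToolResult(result));
            }
            ModelReply::Text(text) => return text,
        }
    }
}

/// Test double: requests one tool call, then answers with the tool's output.
pub struct ScriptedModel;

impl Model for ScriptedModel {
    fn respond(&mut self, transcript: &[Message]) -> ModelReply {
        match transcript.last() {
            Some(Message::ToolResult(Ok(output))) => {
                ModelReply::Text(format!("tool said: {output}"))
            }
            Some(Message::ToolResult(Err(error))) => {
                ModelReply::Text(format!("tool failed: {error}"))
            }
            _ => ModelReply::ToolUse { name: "echo".into(), args: "hello".into() },
        }
    }
}

/// Test double: a single `echo` tool that returns its arguments.
pub struct EchoExecutor;

impl Executor for EchoExecutor {
    fn call_tool(&mut self, name: &str, args: &str) -> Result<String, String> {
        if name == "echo" {
            Ok(args.to_string())
        } else {
            Err(format!("unknown tool: {name}"))
        }
    }
}
```

With the two test doubles, one call to `run_turn` leaves four transcript entries: the user message, the tool request, the tool result, and the final assistant text.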

### Harness Instantiation

The harness is instantiated with a system prompt and a tarpc client to the tool executor.
(In the first iteration, we use an in-process channel for the tarpc client.)

### Model Integration

The harness uses a trait to make it agnostic to the underlying model provider.

This trait unifies a variety of APIs using a `StreamEvent` interface for streaming responses
from the API.
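
A minimal sketch of what such a trait could look like follows. Only the name `StreamEvent` comes from this document; its variants, the `ModelProvider` trait, `CannedProvider`, and `collect_response` are assumptions for illustration, and a real provider would stream asynchronously from an HTTP API rather than from a `Vec`.

```rust
/// Provider-neutral streaming events (the variant names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    TextDelta(String),
    ThinkingDelta(String),
    ToolUse { name: String, args_json: String },
    Usage { input_tokens: u64, output_tokens: u64 },
    Done,
}

/// Each backend adapts its own wire format into `StreamEvent`s.
pub trait ModelProvider {
    fn stream(&self, prompt: &str) -> Box<dyn Iterator<Item = StreamEvent>>;
}

/// Stand-in provider that replays a fixed script instead of calling an API.
pub struct CannedProvider {
    pub events: Vec<StreamEvent>,
}

impl ModelProvider for CannedProvider {
    fn stream(&self, _prompt: &str) -> Box<dyn Iterator<Item = StreamEvent>> {
        Box::new(self.events.clone().into_iter())
    }
}

/// Provider-agnostic consumer: accumulates visible text and output tokens
/// until `Done`, ignoring thinking and tool-use events.
pub fn collect_response(provider: &dyn ModelProvider, prompt: &str) -> (String, u64) {
    let mut text = String::new();
    let mut output_tokens = 0;
    for event in provider.stream(prompt) {
        match event {
            StreamEvent::TextDelta(chunk) => text.push_str(&chunk),
            StreamEvent::Usage { output_tokens: n, .. } => output_tokens += n,
            StreamEvent::Done => break,
            StreamEvent::ThinkingDelta(_) | StreamEvent::ToolUse { .. } => {}
        }
    }
    (text, output_tokens)
}
```

The point of the design is that `collect_response` never mentions a specific provider, so supporting another API would only require a new `ModelProvider` implementation.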

Currently, only Anthropic's Claude API is supported.

Messages are constructed so as to support prompt caching when available.

### Session Logging
- JSONL format, one event per line
- Events: user message, assistant message, tool call, tool result.
- Tree-addressable via parent IDs (enables conversation branching later)
- Token usage stored per event
- Linear UX for now, branching deferred
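
To show how parent IDs make the log tree-addressable, the sketch below reconstructs one conversation branch from a flat list of events. The `LogEvent` field names and the helper functions are hypothetical, and JSONL (de)serialization is omitted to keep the example dependency-free.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum EventKind {
    UserMessage,
    AssistantMessage,
    ToolCall,
    ToolResult,
}

/// One line of the session log (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct LogEvent {
    pub id: u64,
    pub parent_id: Option<u64>,
    pub kind: EventKind,
    pub tokens: u64,
}

/// Walks parent links from `leaf` up to the root and returns the ids root-first.
pub fn branch_to(events: &[LogEvent], leaf: u64) -> Vec<u64> {
    let by_id: HashMap<u64, &LogEvent> = events.iter().map(|e| (e.id, e)).collect();
    let mut path = Vec::new();
    let mut cursor = Some(leaf);
    while let Some(id) = cursor {
        // Stop on an unknown id, or on a repeated id from malformed parent links.
        if path.contains(&id) {
            break;
        }
        match by_id.get(&id) {
            Some(event) => {
                path.push(id);
                cursor = event.parent_id;
            }
            None => break,
        }
    }
    path.reverse();
    path
}

/// Sums the per-event token usage along one branch.
pub fn branch_tokens(events: &[LogEvent], leaf: u64) -> u64 {
    let path = branch_to(events, leaf);
    events
        .iter()
        .filter(|e| path.contains(&e.id))
        .map(|e| e.tokens)
        .sum()
}
```

With the current linear UX every event has exactly one child, so the branch is simply the whole log; the same lookup keeps working once branching is introduced.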

## Executor Design

The key aspect of the executor design is that it is configured with sandbox permissions
that allow tool use without any user prompting. Either the tool use succeeds within the
sandbox and its result is returned to the model, or it fails and a permission error is returned to the model.

Because the sandbox enforces these permissions, even arbitrary shell commands can run without prompting.

### Executor Interface

The executor interface exposed to the harness has the following methods:

- `list_available_tools`: takes no arguments and returns tool names, descriptions, and argument schemas.
- `call_tool`: takes a tool name and its arguments and returns either a result or an error.
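
The two methods can be pictured as a plain Rust trait. This is only a sketch: the actual interface is a tarpc service, which would use tarpc's own service definition and async methods, and the `ToolSpec`, `ToolError`, and `DemoExecutor` types (including the `word_count` tool) are invented for this example.

```rust
/// Description of one tool as returned by `list_available_tools`.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolSpec {
    pub name: String,
    pub description: String,
    /// Argument schema, kept as an opaque string in this sketch.
    pub args_schema: String,
}

/// Hypothetical error cases; permission errors are reported to the model.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolError {
    UnknownTool(String),
    PermissionDenied(String),
    Failed(String),
}

pub trait ToolExecutor {
    fn list_available_tools(&self) -> Vec<ToolSpec>;
    fn call_tool(&self, name: &str, args: &str) -> Result<String, ToolError>;
}

/// Toy executor exposing a single made-up tool.
pub struct DemoExecutor;

impl ToolExecutor for DemoExecutor {
    fn list_available_tools(&self) -> Vec<ToolSpec> {
        vec![ToolSpec {
            name: "word_count".into(),
            description: "Counts whitespace-separated words in the argument.".into(),
            args_schema: r#"{"type":"string"}"#.into(),
        }]
    }

    fn call_tool(&self, name: &str, args: &str) -> Result<String, ToolError> {
        match name {
            "word_count" => Ok(args.split_whitespace().count().to_string()),
            other => Err(ToolError::UnknownTool(other.to_string())),
        }
    }
}
```

Keeping the surface this small means the harness needs no compile-time knowledge of individual tools; it forwards whatever `list_available_tools` reports to the model.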

### Sandboxing

Sandboxing is done using the Linux kernel feature "Landlock".

This allows restricting file system access (read-only, read/write, or no access)
as well as network access (on or off).
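
One way to picture the configuration is as plain data that is later translated into Landlock rules. The sketch below only models the policy and how overlapping rules resolve; it does not call Landlock, and `SandboxPolicy`, `FsAccess`, and `access_for` are hypothetical names. It assumes additive, allow-list semantics: anything not covered by a rule has no access, and a more specific rule can only grant more than its parent.

```rust
use std::path::{Path, PathBuf};

/// Access levels, ordered from least to most permissive.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum FsAccess {
    NoAccess,
    ReadOnly,
    ReadWrite,
}

/// Hypothetical policy fixed before the executor starts.
pub struct SandboxPolicy {
    pub fs_rules: Vec<(PathBuf, FsAccess)>,
    pub network_enabled: bool,
}

impl SandboxPolicy {
    /// Effective access for `path`: the most permissive rule whose root is
    /// the path itself or one of its ancestors; no matching rule means no access.
    pub fn access_for(&self, path: &Path) -> FsAccess {
        self.fs_rules
            .iter()
            .filter(|(root, _)| path.starts_with(root))
            .map(|(_, access)| *access)
            .max()
            .unwrap_or(FsAccess::NoAccess)
    }
}
```

For example, a policy with a read-only rule on a project directory and a read/write rule on one file inside it resolves that file to `ReadWrite`, the rest of the project to `ReadOnly`, and everything else to `NoAccess`.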

## TUI Design

The bulk of the complexity of this coding agent is pushed to the TUI in this design.

The driving goals of the TUI are:

- Support (neo)vim-style keyboard navigation and modal editing.
- Full progressive disclosure of information: high-level information is understandable at a glance,
  but full tool use and thinking traces can be expanded.
- Support for instantiating multiple different instances of the core harness (e.g. different
  instantiations for code review vs. planning vs. implementation).

## UI
- **Agent view:** Tree-based hierarchy (not flat tabs) for sub-agent inspection
- **Modes:** Normal, Insert, Command (`:` prefix from Normal mode)
- **Activity modes:** Plan and Execute are visually distinct activities in the TUI
- **Streaming:** Barebones styled text initially, full markdown rendering deferred
- **Token usage:** Per-turn display (between user inputs), cumulative in status bar
- **Status bar:** Mode indicator, current activity (Plan/Execute), token totals, network policy state
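
The three editing modes can be sketched as a small state machine. Only the `:` prefix for Command mode comes from the list above; the `i`, `Esc`, and `Enter` bindings are vim-style assumptions, and `Mode`, `Key`, and `next_mode` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Normal,
    Insert,
    Command,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    Char(char),
    Esc,
    Enter,
}

/// Returns the mode that is active after `key` is pressed in `mode`.
pub fn next_mode(mode: Mode, key: Key) -> Mode {
    match (mode, key) {
        // Assumed binding: `i` starts typing into the chat input.
        (Mode::Normal, Key::Char('i')) => Mode::Insert,
        // From the mode list: `:` opens Command mode from Normal mode.
        (Mode::Normal, Key::Char(':')) => Mode::Command,
        // Assumed bindings: Esc cancels, Enter submits the command line.
        (Mode::Insert, Key::Esc) | (Mode::Command, Key::Esc) => Mode::Normal,
        (Mode::Command, Key::Enter) => Mode::Normal,
        // Every other key leaves the mode unchanged.
        (current, _) => current,
    }
}
```

Keeping the transition function pure makes the modal behavior easy to test without a terminal.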

## Planning Mode

In planning mode the TUI instantiates a harness with read access to the project directory
and write access to a single plan markdown file.

The TUI then provides a glue mechanism that pipes that plan into a new instantiation of the
harness in execute mode.

Additionally, we specify a schema for "surveys" that allow the model to ask the user questions about
the plan.

We also provide a hotkey (Ctrl+G or `:edit-plan`) that opens the plan in the user's `$EDITOR`.

## Sub-Agents
- Independent context windows with summary passed back to parent
- Fully autonomous once spawned
- Hard deny on unpermitted actions
- Plan executor is a specialized sub-agent where the plan replaces the summary
- Direct user interaction with sub-agents deferred