Add command interface and status indicator.
parent 5cb6647513
commit 7b9525ef95
7 changed files with 358 additions and 22 deletions
@@ -67,7 +67,7 @@
- **`sandbox`:** Landlock policy construction, path validation logic (without applying kernel rules)
- **`core`:** Conversation tree operations (insert, query by parent, turn computation, token totals), orchestrator state machine transitions against mock `StreamEvent` sequences
- **`session`:** JSONL serialization roundtrips, parent ID chain reconstruction
-- **`tui`:** Widget rendering via Ratatui `TestBackend`, snapshot tests with `insta` crate for layout/mode indicator/token display
+- **`tui`:** Widget rendering via Ratatui `TestBackend`
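The `core` and `session` bullets above name query by parent, token totals, and parent ID chain reconstruction as unit-test targets. A minimal sketch of what such a test could exercise follows. The `Node` and `Tree` types, their fields, and every method name are hypothetical stand-ins, since the real types are not shown in this diff.

```rust
use std::collections::HashMap;

/// Hypothetical node shape; the real conversation node type is not shown in this diff.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub id: u32,
    pub parent: Option<u32>,
    pub tokens: u32,
}

/// Hypothetical in-memory tree keyed by node id.
#[derive(Debug, Default)]
pub struct Tree {
    nodes: HashMap<u32, Node>,
}

impl Tree {
    pub fn insert(&mut self, id: u32, parent: Option<u32>, tokens: u32) {
        self.nodes.insert(id, Node { id, parent, tokens });
    }

    /// Ids of all nodes whose parent is `parent`, sorted so assertions are deterministic.
    pub fn children_of(&self, parent: Option<u32>) -> Vec<u32> {
        let mut ids: Vec<u32> = self
            .nodes
            .values()
            .filter(|n| n.parent == parent)
            .map(|n| n.id)
            .collect();
        ids.sort_unstable();
        ids
    }

    /// Follow parent links from `leaf` up to the root and return the ids root-first.
    /// Returns `None` if a link points at a missing node or the links form a cycle.
    pub fn chain_to_root(&self, leaf: u32) -> Option<Vec<u32>> {
        let mut chain = Vec::new();
        let mut cursor = Some(leaf);
        while let Some(id) = cursor {
            let node = self.nodes.get(&id)?;
            chain.push(id);
            if chain.len() > self.nodes.len() {
                return None; // more hops than nodes means a cycle
            }
            cursor = node.parent;
        }
        chain.reverse();
        Some(chain)
    }

    /// Sum of token counts along the chain from the root to `leaf`.
    pub fn token_total(&self, leaf: u32) -> Option<u32> {
        let chain = self.chain_to_root(leaf)?;
        Some(chain.iter().map(|id| self.nodes[id].tokens).sum())
    }
}

/// Small fixture: one root (id 1) with two children (ids 2 and 3).
pub fn sample_tree() -> Tree {
    let mut tree = Tree::default();
    tree.insert(1, None, 10);
    tree.insert(2, Some(1), 20);
    tree.insert(3, Some(1), 5);
    tree
}

#[test]
fn chain_and_totals() {
    let tree = sample_tree();
    assert_eq!(tree.children_of(Some(1)), vec![2, 3]);
    assert_eq!(tree.chain_to_root(3), Some(vec![1, 3]));
    assert_eq!(tree.token_total(2), Some(30));
}
```

Returning `Option` from `chain_to_root` lets a test assert on broken chains (a dangling parent ID or a cycle) without panicking, which is the failure a JSONL roundtrip test would most likely need to detect.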
### Integration Tests — Component Boundaries
- **Core ↔ Provider:** Mock `ModelProvider` replaying recorded API sessions (full SSE streams with tool use). Tests the complete orchestration loop deterministically without network.
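The mock-provider approach described above can be sketched as a provider that pops pre-recorded events off a queue, so the orchestration loop runs the same way on every test run. The names `ModelProvider` and `StreamEvent` come from the plan, but the trait method, the enum variants, `ReplayProvider`, `TurnSummary`, and `run_turn` are all assumptions made for illustration. The real interfaces are presumably async and richer than this.

```rust
use std::collections::VecDeque;

/// Hypothetical event shape; the project's real `StreamEvent` is not shown in this diff.
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    TextDelta(String),
    ToolUse { name: String },
    Done,
}

/// Hypothetical synchronous stand-in for the `ModelProvider` trait.
pub trait ModelProvider {
    fn next_event(&mut self) -> Option<StreamEvent>;
}

/// Mock provider that replays a recorded event sequence in order.
pub struct ReplayProvider {
    events: VecDeque<StreamEvent>,
}

impl ReplayProvider {
    pub fn new(events: Vec<StreamEvent>) -> Self {
        Self { events: events.into() }
    }
}

impl ModelProvider for ReplayProvider {
    fn next_event(&mut self) -> Option<StreamEvent> {
        self.events.pop_front()
    }
}

/// What a single turn produced, in a form a test can assert on.
#[derive(Debug, Default, PartialEq)]
pub struct TurnSummary {
    pub text: String,
    pub tools: Vec<String>,
    pub completed: bool,
}

/// Toy orchestration loop: drain events until `Done` or until the stream ends early.
pub fn run_turn(provider: &mut dyn ModelProvider) -> TurnSummary {
    let mut summary = TurnSummary::default();
    while let Some(event) = provider.next_event() {
        match event {
            StreamEvent::TextDelta(chunk) => summary.text.push_str(&chunk),
            StreamEvent::ToolUse { name } => summary.tools.push(name),
            StreamEvent::Done => {
                summary.completed = true;
                break;
            }
        }
    }
    summary
}

#[test]
fn replayed_session_is_deterministic() {
    let mut provider = ReplayProvider::new(vec![
        StreamEvent::TextDelta("Hel".to_string()),
        StreamEvent::TextDelta("lo".to_string()),
        StreamEvent::ToolUse { name: "read_file".to_string() },
        StreamEvent::Done,
    ]);
    let summary = run_turn(&mut provider);
    assert_eq!(summary.text, "Hello");
    assert_eq!(summary.tools, vec!["read_file".to_string()]);
    assert!(summary.completed);
}
```

Because the provider is passed as a trait object, the same `run_turn` loop could be driven by a recorded fixture in tests and by a network-backed provider elsewhere. A stream that ends without `Done` leaves `completed` false, which gives a test a hook for asserting on truncated recordings.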
@@ -78,11 +78,6 @@
- **Recorded session replay:** Capture real Claude API HTTP request/response pairs, replay deterministically. Exercises full stack (core + channel + mock TUI) without cost or network dependency. Primary E2E test strategy.
- **Live API tests:** Small suite behind feature flag / env var. Verifies real API integration. Run manually before releases, not in CI.
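One way to keep the live suite out of CI is an environment-variable gate that each live test checks before making a request. The sketch below assumes a variable named `LIVE_API_TESTS` and treats `1` or `true` as enabled. Both choices are hypothetical, since the plan only says "feature flag / env var".

```rust
use std::env;

/// Hypothetical variable name; the plan does not specify one.
pub const LIVE_FLAG: &str = "LIVE_API_TESTS";

/// Pure decision function, kept separate from the environment so it can be unit tested.
pub fn live_enabled_from(value: Option<&str>) -> bool {
    matches!(value.map(str::trim), Some("1") | Some("true"))
}

/// Reads the process environment and applies the rule above.
pub fn live_tests_enabled() -> bool {
    live_enabled_from(env::var(LIVE_FLAG).ok().as_deref())
}

#[test]
fn live_api_smoke() {
    if !live_tests_enabled() {
        eprintln!("skipping: set {}=1 to run live API tests", LIVE_FLAG);
        return;
    }
    // A real request against the API would go here.
}
```

An early return makes the test report as passed rather than skipped. Marking live tests `#[ignore]` and running them with `cargo test -- --ignored` is an alternative that keeps them visible in test output.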
-### Snapshot Testing
-- `insta` crate for TUI visual regression testing from Phase 2 onward
-- Capture rendered `TestBackend` buffers as string snapshots
-- Catches layout, mode indicator, and token display regressions
### Benchmarking — SWE-bench
- **Target:** SWE-bench Verified (500 curated problems) as primary benchmark
- **Secondary:** SWE-bench Pro for testing planning mode on longer-horizon tasks
@@ -93,7 +88,6 @@
### Test Sequencing
- Phase 1: Unit tests for SSE parser, event types, message serialization
-- Phase 2: Snapshot tests for TUI with `insta`
- Phase 4: Recorded session replay infrastructure (core loop complex enough to warrant it)
- Phase 6-7: Headless mode + first SWE-bench Verified run