// CASE FILE 03 — CODENAME: CHAOS WORKS — CODING AGENT · SELF-SCAFFOLDING

CODE THAT CONJURES ITSELF, ORNITH 1.0.

A desktop coding-agent harness (Next.js + Electron, React + TypeScript) for the self-scaffolding ornith:9b model on local Ollama — each session binds its own pluggable Python agent harness that plans tasks and streams native tool-calls: file edits, reads, shell runs. Like Wanda's constructs: the scaffolding appears exactly when the work demands it.

Inspired By

Wanda

Role

Creator

Model

ornith:9b · Local

Tool-Calls

Native

View on GitHub ↗ ← Back to Home

Plan★Scaffold★Tool-Call★Critique★Inspect★Ship★ Plan★Scaffold★Tool-Call★Critique★Inspect★Ship★

FILE 01 — THE MISSION

A HARNESS FOR A MODEL THAT BUILDS ITS OWN

Coding agents usually get one hard-coded harness and like it. Ornith 1.0 inverts that: built for the self-scaffolding ornith:9b model running on local Ollama, every session binds its own pluggable Python agent harness — the scaffolding is part of the experiment, not a constant.

The model plans tasks and streams native tool-calls — file edits, reads, shell runs — while the desktop app makes every step visible: an Inspector / Thinking / Critiques workspace surfaces the agent's ReAct reasoning and an L2–L7 critic ledger for every response. Chaos, fully observed.

"The best harness is the one the session conjures for itself."

FILE SNAPSHOT
Model — self-scaffolding ornith:9b, local Ollama
Harnesses — pluggable Python, per session
Tool-calls — native: edits, reads, shell runs
Critics — L2–L7 ledger on every response
Desktop — Next.js + Electron, React + TS
Tested — full Playwright e2e suite

FILE 02 — THE RITUAL

FROM EMPTY SESSION TO RUNNING AGENT

Bind a Harness

Each session picks from the harness registry (load / unload) — a pluggable Python agent harness that defines how this agent will work.

Plan the Task

The self-scaffolding ornith:9b model plans the work — running entirely on local Ollama, no cloud dependency.

Stream Native Tool-Calls

File edits, reads, and shell runs stream live as native tool-calls — you watch the code change as the agent reasons.

Face the Critics

An L2–L7 critic ledger attaches to every response — layered verdicts on the agent's output before you trust it.

Inspect Everything

The Inspector / Thinking / Critiques workspace lays out ReAct reasoning, tool activity, and critic verdicts side by side — with session and folder organisation keeping long projects sane.

FILE 03 — CORE SYSTEMS

CHAOS MAGIC, FULLY OBSERVED

🔮

Self-Scaffolding Model

ornith:9b on Local Ollama

A model that constructs its own working scaffolding — the harness binds per session instead of being welded to the app.

ornith:9bOllama

🧩

Harness Registry

Pluggable Python · Load / Unload

Swap agent harnesses like spell books — each session binds the Python harness it needs, and the registry manages the collection.

Pluggable HarnessesPer-Session

🛠️

Native Tool-Calls

Edits · Reads · Shell Runs

No brittle text-parsing — the model emits native tool-calls that stream into the workspace as they execute.

ReActStreaming

🕵️

Inspector Workspace

Thinking · Critiques · L2–L7 Ledger

Reasoning, tool activity, and a seven-layer critic ledger per response — the whole session is glass-walled.

Critic LedgerObservability

🖥️

Desktop-Grade App

Next.js + Electron · React + TS

Session and folder organisation for real projects — a coding agent that lives where the code lives.

Next.jsElectron

🎭

Playwright E2E Suite

The Harness Is Itself Tested

A full end-to-end test suite covers the app — because a tool for judging agents should survive judgment itself.

PlaywrightE2E

Critic Layers (L2–L7) per Response

100%

Local Model Inference

Params in the Self-Scaffolding Model

END OF FILE

MISSION LOGGED. RETURN TO BASE.

← Return to Home Browse All Case Files GitHub ↗