Automated research systems

Research loops you can audit.

Mutome is a local-first workbench for eval-driven research loops: propose candidates, run validators and benchmarks, keep what improves, and archive what fails.

Randomness in discovery. Determinism in verification.

Request access

Premise

Model confidence is not proof.

The engine separates discovery from promotion. A model can suggest a protocol, patch, proof strategy, or experiment; Mutome only moves it forward when the run leaves replayable evidence behind.

Loop

Propose. Verify. Archive.

Propose

Generate candidate lab protocols, code patches, proof routes, and experiment variants through structured exploration.

Verify

Evaluate candidates with deterministic validators, frozen benchmarks, replay commands, and predeclared metrics.
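A toy version of the loop shows how the two halves fit together. This is a minimal sketch, not Mutome's engine: `propose`, `score`, `run_loop`, and `Ledger` are hypothetical names, and the quadratic objective stands in for a frozen benchmark. Randomness lives only in the proposal step, behind a recorded seed, so the whole run can be replayed.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Hypothetical record of every candidate, kept or not."""
    kept: list = field(default_factory=list)
    archived: list = field(default_factory=list)


def propose(rng, parent):
    """Discovery: a random perturbation of the current best candidate."""
    return parent + rng.uniform(-1.0, 1.0)


def score(candidate):
    """Verification: a deterministic metric (toy objective, peak at 3.0)."""
    return -(candidate - 3.0) ** 2


def run_loop(seed, steps):
    """Propose, verify, then keep improvements and archive the rest."""
    rng = random.Random(seed)  # seeded, so the run is replayable
    best = 0.0
    best_score = score(best)
    ledger = Ledger()
    for _ in range(steps):
        candidate = propose(rng, best)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
            ledger.kept.append((candidate, candidate_score))
        else:
            ledger.archived.append((candidate, candidate_score))
    return best, ledger
```

Calling `run_loop(7, 200)` twice returns the same best candidate and the same ledger, because the seed fixes every proposal and the metric has no hidden state.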

No promotion without a gate.

heuristic

candidate idea, untrusted until measured

model_checked

tests, validators, or adapters reproduce the claim

solver_certified

a solver certifies the relevant subclaim

proof_checked

machine-checkable proof artifacts exist

reviewed

human review, attribution, and reproduction are complete
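One way to picture these labels is as an ordered ladder where a candidate climbs one rung per passed gate. The sketch below assumes the five levels form a strict order, which the list suggests but does not state; the `Evidence` enum and the `promote` function are hypothetical names for illustration.

```python
from enum import IntEnum


class Evidence(IntEnum):
    """Evidence levels from the list above, in an assumed strict order."""
    HEURISTIC = 0
    MODEL_CHECKED = 1
    SOLVER_CERTIFIED = 2
    PROOF_CHECKED = 3
    REVIEWED = 4


def promote(level, gate_passed):
    """Move up one level only when the gate for the next level passed."""
    if not gate_passed or level is Evidence.REVIEWED:
        return level  # no gate, no promotion; REVIEWED is the top rung
    return Evidence(level + 1)
```

A failed gate leaves the label unchanged, so `promote(Evidence.HEURISTIC, False)` stays at `HEURISTIC` while `promote(Evidence.HEURISTIC, True)` yields `MODEL_CHECKED`.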

Release path

Engine now. Desktop next.

The current engine runs protocol search, protocol autotune, and command autotune. It can evolve lab protocols against frozen validators, run git-backed patch/eval/metric loops, and package campaigns with artifacts, hashes, reports, and candidate ledgers.
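To make "artifacts, hashes, and candidate ledgers" concrete, here is a minimal sketch of a hash-stamped ledger entry. The field names and the functions `artifact_hash`, `ledger_entry`, and `verify_entry` are assumptions for illustration, not Mutome's actual packaging format; only the standard-library `hashlib` and `json` calls are real.

```python
import hashlib
import json


def artifact_hash(data):
    """SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def ledger_entry(candidate_id, metric, artifacts):
    """Build a ledger entry (hypothetical schema) for one candidate.

    `artifacts` maps an artifact name to its raw bytes.
    """
    entry = {
        "candidate": candidate_id,
        "metric": metric,
        "artifacts": {
            name: artifact_hash(blob)
            for name, blob in sorted(artifacts.items())
        },
    }
    # Hash a canonical JSON form so the entry itself is tamper-evident.
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["entry_hash"] = hashlib.sha256(payload).hexdigest()
    return entry


def verify_entry(entry, artifacts):
    """Recompute the entry from the artifacts and compare."""
    rebuilt = ledger_entry(entry["candidate"], entry["metric"], artifacts)
    return rebuilt == entry
```

If any packaged artifact changes after the run, `verify_entry` returns `False`, which is the property a replayable campaign archive would need from its ledger.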

The desktop release turns those outputs into a native research cockpit: launch runs, watch score lift, inspect the current mutation, connect local providers, and decide whether to keep, harden, or discard a candidate.

Research automation, with the evidence still attached.
