RFD 0048 — The test atom: in-language unit tests, and why a test is substrate
- State: accepted — partially implemented (v1 shipped in #804; deferred assertion forms remain)
- Depends on: RFD 0015 (
mutatebody surface — a test body is a mutate body plusassert), RFD 0025 (checkdischarge — the closest relative, and the contrast that defines a test), RFD 0020 (runtime engine —Engine::evaluate, the one read path a deductive-plane assert uses), RFD 0042 (self-validating.oxbin— the trust argument for carrying tests in the artifact), the scenario harness (#764 — the integration-test sibling). - Prior art: the §19 walking-example
test "…" { … }sketch (a pre-implementation surface, superseded here); the scenario-harnessexpectvocabulary (oxc-driver/src/harness.rs); Rust’s#[test]/#[cfg(test)](the source-discovery model this RFD weighs and rejects for Argon).
Question
Argon needs in-language tests: a way to write, inside a package, a unit test of what its rules, fns, and mutations actually do. Two questions follow. (1) Surface: what does a test look like and what does it assert? (2) Carrier: is a test a substrate concern — an elaborated declaration persisted in the .oxbin axiom-event log and mechanized in the Lean carrier taxonomy, like mutate/query/check — or a tooling concern, discovered from source by ox test and never persisted? The carrier question is the load-bearing one, because it decides whether the Lean mechanizes anything about tests at all.
Context
The test keyword has been reserved since the early grammar, parsing-but-silently-swallowing its body — the last entry on the v0.2.1 loudness burn-down (#342). The scenario harness (#764, ox run-scenario over scenarios/*.toml) shipped as the integration-test surface and deliberately reserved the tests/ directory for the in-language test atom, the unit-test surface. The §19 walking-example sketched a test form (imperative let/insert + assert) that never parsed.
The construct shipped in #804. During review one architectural question was deferred to this RFD: putting a test in the substrate means adding AxiomKind::TestDecl, which — because AxiomKind is @[language_interface]-mechanized — forces a testDecl variant into the Lean. AGENTS.md says “the Lean does not cover … tooling.” So either tests are not tooling, or the carrier choice is wrong. This RFD settles it.
Decision
A test is substrate, at the carrier / data-shape layer. The runner is tooling. Precisely:
- The test declaration is an elaborated Core IR artifact, carried in the axiom-event log as
AxiomKind::TestDeclwith bodyTestDeclBody { name: String, operations: Vec<Operation> }— structurally identical toMutationDeclBody. The body is the sameOperationCore IR amutatelowers to, plus the test-only assertion ops:Operation::Assert { condition, rendered }(value/boolean),Operation::AssertDerivable { predicate, args, negated, rendered }(derivability), andOperation::AssertRejects { ops, expected_code, rendered }(negative enforcement — the block’s own loweredOperationstream, run isolated). The Lean carriestestDeclin theAxiomKindtaxonomy (Argon/Storage/AxiomKind.lean). - The runner is tooling, in Rust and prose only:
ox test, the per-test fresh-store orchestration, the PASS/FAIL/ERROR classification and reporting,--filter, the non-zero exit. The Lean mechanizes none of this, andOperation::Assert’s arity is not in the drift gate (Operationis not a@[language_interface]inductive — same as every other mutate op).
The surface (shipped):
test "active leases are returned for tenant" {
let alice = insert Person { name: "Alice", age: 30 };
let unit = insert Property { address: "1 Main", sqft: 700 };
let l = sign_lease(alice, unit, 2500, 365);
assert alice.active_leases() == [l];
}
A test "<string-name>" { … } is a named imperative block run top-to-bottom against a fresh store. The body is the mutate-body statement set (RFD 0015) interleaved with assert <bool-expr>; — no separate fixture block. assert mirrors require, with one difference: a failed require aborts the body; a failed assert records a pass/fail outcome and execution continues. An assert whose condition names a derived predicate / pub query / nav-method is evaluated against the reasoner’s materialized extent (via Store::query_derive, the one read path), so a test asserts what the rules derive, not just what was stored. assert is test-only: a stray assert in a mutate/fn body refuses at build time with OE1318.
The assertion vocabulary completes its negative-enforcement half (shipped): alongside the positive assert <bool-expr> and assert [not] derivable F(args), the rejection form assert rejects [( Pkg::Code )] { <mutate-body-stmts> } asserts that a write block is refused by a write-path guard. Argon is a constraint language — checks, where/iff invariants, group axioms — and without this form the test atom could assert what a model accepts but not what it refuses, which is the more important property of a constraint. The block runs against an isolated copy of the test world (committing nothing back) and the result is classified: a genuine write-path guard rejection (a where-invariant OE0668, a check delta-guard, a group axiom, an endpoint refusal) PASSes — matching a pinned Pkg::Code when given, a different code FAILing as the wrong reason; an accepted write FAILs; and a non-guard error (a typo, an unbound reference, a type error) ERRORs loudly. The guard-vs-non-guard line is load-bearing: only a genuine constraint refusal satisfies rejects, so a broken test can never masquerade as a passing rejection test. rejects, like derivable, is a contextual keyword (leading position after assert).
Rationale
The carrier ruling is validatable on four independent grounds; each was checked against the code and the Lean.
-
AxiomKindis the event-log taxonomy, not the five-atom set — and tests join an existing precedent. The five atoms (meta-calculus, constructs, rule, trait, macro) are the vocabulary-introduction primitives and are fixed.AxiomKindis broader: it already carriesruleDecl,queryDecl,mutationDecl,computeDecl, andbridgeDecl— none of which is one of the five atoms.testDeclsits in exactly that company. So calling it “the test atom” is colloquial naming of a declaration form, not a claim of a sixth substrate atom. (Verifiable:Argon/Storage/AxiomKind.lean,oxc-protocol/src/storage.rs.) -
A test body is Core IR, which the Lean scope already covers.
TestDeclBody.operationsisVec<Operation>— the identical elaborated IR amutatelowers to. AGENTS.md’s mechanization scope lists “Core IR: elaborated intermediate representation; surface-to-IR lowering preservation” and the Storage event log. A test, once elaborated, is a member of both. Carrying it is consistent with what is already mechanized; excluding it would be the special case requiring justification. -
The carrier/runner split is exactly what “the Lean does not cover tooling” means. The Lean
testDeclis a tag in the carrier enum; it carries no test semantics — there is no mechanized pass/fail relation, noox testmodel, no isolation theorem. The tooling (the runner and its reporting) lives in Rust and prose, untouched by the Lean. So the scope line holds verbatim: the event-log taxonomy (Storage) is substrate and mechanized; the runner is tooling and is not. AddingtestDeclto the taxonomy no more “mechanizes tooling” thanmutationDecldoes. -
Trust and reproducibility (RFD 0042). The
.oxbinis the package’s complete elaborated form, and it is self-validating: it re-checks its own invariants at load. A test carried in the artifact runs against the exact elaborated state thatox check/ox buildvalidated — there is no second elaboration that could drift from the checked one. Source-discovery would re-parse and re-elaborate test bodies independently of the artifact, opening precisely that drift. Carrying tests is the choice that preserves “the artifact is the package.”
Alternatives
- Source-discovery (the Rust
#[cfg(test)]model): rejected.ox testwould parsetests/*.ar, elaborate test bodies on demand, and run them, persisting nothing. It is appealing on the intuition that “tests aren’t knowledge,” but it (a) reintroduces a second elaboration path that can drift from the checked artifact (against RFD 0042), (b) makes a test the only package-level declaration not carried in the artifact, splitting the declaration model for no semantic gain, and (c) buys nothing the carrier model lacks — tests are already inert at query/serve time, so persisting them costs nothing at runtime. - A sixth atom: rejected, and a category error. A test introduces no vocabulary and no new substrate primitive; it is a declaration form whose body is existing Core IR. The five atoms are fixed (see ground 1).
- Stripping tests from every artifact: not now. Whether a release artifact should omit its tests (as a Rust release binary omits
#[cfg(test)]) is a build-profile question, not a carrier question — it does not bear on whether tests are substrate. Deferred (see open questions).
Consequences
AxiomKind::TestDeclis in the.oxbinevent log and the Lean carrier taxonomy (the variant count is now 26). Tests are inert at query/serve time — onlyox testenumerates them.- A test is not a
check. Acheckis a standing universal obligation that fires a diagnostic over whatever world is loaded and is observer-only (it populates no IDB); a test is an existential example that constructs a known world and asserts a specific expected outcome. Check-firing is one thing a test can assert — directly, viaassert rejects [( Code )] { <write> }, which asserts the write is refused by a guard (and on the exact code when pinned) — but the two are duals, not the same construct. - Isolation is fresh-store-per-test (the scenario-harness discipline), so tests are order-independent; within a test, writes are read-your-writes over the committed + deductive state.
- A published artifact currently carries its tests. This is acceptable (they are inert) and is the trust-preserving default; a future release profile may strip them.
Open questions
- Shared / parameterized fixtures across tests — the one deferred assertion-surface item (the
fixture/expectblock forms; refuses loudly until built, never silently accepted). The dedicated derivability formsassert derivable F/assert not derivable Fand the three-valued outcomes (anINCONCLUSIVEresult distinct from FAIL when a derivability assert is evaluated over an open-world relation, where absence iscan, notnot— the same fail-closed discipline RFD 0045 forced for the write side) are now built (§17.14): present ⇒ PASS/FAIL, absent under CWA ⇒ FAIL/PASS, absent under OWA ⇒ a loud INCONCLUSIVE that exitsox testnon-zero. The negative-enforcement formassert rejects [( Pkg::Code )] { … }is also now built (§17.14): a write block is run isolated and a write-path guard rejection PASSes (on the pinned code when given), an accepted write FAILs, and a non-guard error ERRORs loudly — so the assertion vocabulary covers both what a model accepts and what it refuses. - Release-profile stripping of tests from a distributed artifact (a build-profile lever, orthogonal to the carrier ruling).
- An assertion-soundness theorem in Lean (optional research): that a passing deductive-plane assert implies the asserted derivation holds at the fixpoint. The runner stays tooling regardless; this would only mechanize the meaning of the carried
Assertop, adjacent toReasoning/Checks.lean.