Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

RFD 0019 — Mutation write-path correctness: construction, identity, read-your-writes, and exact values

  • State: accepted — partially implemented (RC1/RC2/RC4 landed; required-fields and within-body read-your-writes remain)
  • Opened: 2026-06-06
  • Decides: the surface and runtime semantics that take Argon’s write path — insert/construct → mint identity → persist fields and relations as ABox events → read them back within the same body and on reload — from silently incorrect to correct, durable, and Lean-conformant. Records the decisions taken on the design questions in RP-010 §6/§11. Refines RFD 0015 (the mutate body surface — insert disambiguation), and connects to RFD 0001 (identity), RFD 0006 (IndividualPropertyAssertion), RFD 0007 (required/missing fields under OWA), and RFD 0016 (exact Real). Its read/execution-path counterpart is RFD 0018 (RP-009); this RFD owns the write side and coordinates at the seams (§Consequences).

This RFD is the architecture decision record for the mutation write path. It is Lean-first where it touches semantics — the engine conforms to spec/lean/Argon/Runtime/MutationSemantics.lean (+ MutationFreshness.lean, AggregateExact.lean); divergence is a bug. It commits a plan and a set of surface-semantics decisions, not code. The full verified evidence lives in RP-010; this RFD states the decisions and why.


Question

Argon can parse and type a rich ontology and can reason over facts at small scale, but the write path is silently incorrect: real ontologies produce no usable instance data. Concretely (verified against origin/main @ 967cc03; see RP-010 §1 for the file:line evidence):

  • RC1 — positional insert Concept(...) does not construct. The parser dispatches purely syntactically (L_PAREN ⇒ relation-tuple op, L_BRACE ⇒ struct/construct op), never consulting whether the head resolves to a concept or a relation. So insert Person(name) lowers to a relation-tuple assertion that mints no individual, persists no fields, and lands under NameRef::DEFAULT (a garbage relation id). In let position it is totally silent: lower_let_stmt (mutate_lower.rs:271) only binds an INSERT_STRUCT_OP, so let p = insert Person(name) emits zero ops and binds nothing — and a later unbound p then collapses to a blake3("individual:{name}") content-hash id (oxc-runtime/src/lib.rs:3874), identical across runs. The brace form insert Type { f = v } is correct.
  • RC2 — no read-your-writes; navigation fields invisible. A Term::Proj reader in a mutate body reads committed (pre-mutation) state, not this body’s own buffered writes, so coll = coll + [x] clobbers. And from <rel>.endpoint navigation-view fields — which do resolve in the derive/query read path — are invisible inside mutate bodies.
  • RC4 — exact Real mishandled. The rule-body aggregate fold is exact BigRational (RFD 0016), but the mutate/term value path is f64-based: as_f64 has no Value::Real arm, so mutate-expression arithmetic/aggregates error on exact-Real operands.
  • §1.5 — required fields unchecked. No construction-time field-coverage check exists; insert Type { … } with missing required (incl. inherited) fields silently creates an incomplete individual. The relevant diagnostics (OE1908, OE1014) are defined-but-never-emitted.

What surface and runtime semantics make this path correct — and which of the “silent-wrong” behaviors become loud?

Context

The substrate is ahead of the implementation here. The Lean mechanizes a denotational mutate interpreter (Runtime/MutationSemantics.lean, 734 lines): a typed-literal insert mints a fresh IndividualId and emits an iof assertion plus per-field IndividualPropertyAssertions; a relation insert emits a RelationTuple; atomicity is structural (a failing run is Except.error, which carries no effects). MutationFreshness.lean proves mutate_run_fresh_ge (every minted id exceeds every committed one). AggregateExact.lean proves the folds are exact over Rat. So the correct behavior is mechanized; the Rust write path simply does not match it. This RFD’s job is to (a) settle the few surface-semantics calls the Lean does not pin down, and (b) commit the build order that makes the Rust conform.

The applied pressure is concrete: the residential-lease + accounting overlay (sharpe-ontology, ~5.2k LOC of .ar) cannot create a single faithful instance graph today — its entire Create* surface uses positional concept-insert, which silently persists nothing. This RFD is the prerequisite for the overlay (and therefore the production reasoner, RFD 0018) having real data to operate on.

Decision

1. Construction is brace-only; positional concept-insert is a hard error (RC1)

There is exactly one construct syntax: insert C { field = value, … }. The two surface forms are disjoint by bracket, and the bracket is the semantics:

  • insert C { … }construct. Mints a fresh IndividualId, asserts iof(id, C), and persists every supplied field (scalar, list, nested-individual, relation-typed, inherited) as ABox events. Value-producing: let p = insert C { … } binds p to the new individual.
  • insert R(args)relation-tuple assertion, only when R resolves to a relation. Emits a RelationTuple under R’s id.
  • insert C(args) where C resolves to a concepthard error (a new diagnostic, tentatively OE0212, finalized in grammar.toml at implementation): “positional insert of a concept constructs nothing; use the brace form insert C { … }.” The elaborator already has the module’s concept/relation index; it resolves the head and rejects rather than emitting a garbage tuple.

This kills the silent misroute, keeps the surface unambiguous (parens = relation, braces = construct), and avoids overloading parens to mean “construct sometimes.” The cost is an overlay migration (positional concept-inserts → brace form), which is mechanical and one-time.

The unbound-Var blake3 fallback (resolve_term_to_value:3874) is deleted: an unresolved Term::Var in a value position is a hard error, never a silently-minted content-hash id.

2. Identity: fresh monotonic surrogate, confirmed; no content-hash minting (RC1, RFD 0001)

The AtomicU64 surrogate minter is correct and stays. Distinct constructions get distinct ids; replay correctness comes from the event log replaying already-minted ids (MutationFreshness.mutate_run_fresh_ge is the backbone). A content/key-addressed “explicit identity key” (upsert/merge-on-key) is a future feature, explicitly out of scope for v1; if/when added it is opt-in, never the default. The only id-minting path is the surrogate counter.

3. Read-your-writes within the body (RC2)

A read of a field or collection inside a mutate body reflects this body’s own prior writes, not just committed state. So account.records = account.records + [r] accumulates correctly, and construct-then-read works. The value resolver consults the in-body buffered-write overlay (st.collections and the pending property/iof writes) before falling back to committed storage. Commit remains atomic at the body level (buffer → flush; a failing run flushes nothing — matching the Lean’s Except.error atomicity).

4. Navigation-view fields are computable in mutate bodies (RC2)

from <rel>.endpoint projections (and multi-hop chains like pair.book.account.records) resolve inside a mutate body exactly as they do in the derive/query read path (which already supports multi-hop projection — proven by keystone_met_integration). This removes the overlay’s parameter-passing workaround. The mutate evaluator materializes the navigation view on read (read-path parity), under the same world-assumption semantics.

5. One exact value model end to end (RC4, RFD 0016)

There is a single value model: exact BigRational for the exact tower (Real/Decimal/Money), with f64 reserved for explicitly-float types. Value::Real flows through the mutate/compute value path — as_f64/eval_binary and the mutate-path aggregate helpers (aggregate_sum/aggregate_extremum/aggregate_avg) gain a Value::Real arm and stay exact (no f64 promotion on the exact path). This is validated differentially against AggregateExact.lean. So require { value == sum(record.value …) } with value: Real evaluates exactly.

6. Required-field coverage is validated at build and at construction (§1.5, RFD 0007)

Construction with missing required (incl. inherited) fields is rejected with a real diagnostic:

  • Build-time (where statically knowable): a checker pass over insert C { … } against C’s field schema (walking <: for inherited required fields) emits a build diagnostic. The diagnostic-code reconciliation (OE0207 vs the existing OE1908/OE1014) is settled during implementation; the existing defined-but-unemitted codes are wired or replaced, not left dead.
  • Runtime: a construct-time guard catches the dynamically-unknowable cases at the mutate-rejection channel.

Required fields are CWA-at-construction even under concept-level OWA (an individual you are building now must satisfy its required structure), per RFD 0007’s intent distinctions; three-valued/OWA subtleties apply only where a field’s intent is epistemic/optional.

7. Loud failure is a deliverable (§1.7)

Every form that cannot execute correctly becomes a hard error or a build-time diagnostic, never a silent no-op or garbage write: positional concept-insert (§1), unbound-Var id (§2), construct with missing required fields (§6). Making these loud is part of the work, not a follow-on.

Rationale

  • Bracket-as-semantics is the least surprising surface. Once { } is the construct form (it already is, and it is correct), letting ( ) also construct — disambiguated only by a name lookup the reader must perform in their head — is the ambiguity that produced RC1 in the first place. Disjoint brackets mean a modeler (and the parser) can tell construct from relation-assertion locally, without resolving the head. The hard error on concept-(...) turns the one genuinely-ambiguous case into a teachable diagnostic.
  • The substrate already says so. The Lean InsertForm distinguishes typedLiteral (construct, mints identity) from relation (tuple, yields unit). Brace-only construction maps cleanly onto typedLiteral; positional-relation onto relation. The reject rule is the surface honoring a distinction the substrate already draws.
  • Read-your-writes is the transactional intuition every modeler brings; committed-only reads make in-body accumulation silently wrong, which is exactly the failure mode we are eliminating.
  • Exactness must be uniform or the numeric tower (RFD 0016) is a half-truth: a value that is exact in a rule body but lossy in a mutate expression is a latent correctness bug at the read/write seam.
  • Required-field enforcement at build is the earliest, loudest signal; the runtime guard covers what build cannot see. Silent incomplete individuals are the kind of garbage-in that defeats the reasoner downstream.

Alternatives

  • (RC1) Semantic disambiguation — positional insert Concept(...) constructs. The head resolves to a concept ⇒ Construct; to a relation ⇒ tuple. This was the initially-recommended option (more ergonomic; no overlay migration). Rejected in favor of brace-only: it overloads ( ) to mean two different things depending on a name lookup, keeps two construct syntaxes, and pushes head-resolution into a load-bearing position in the parser/elaborator. Brace-only is the more principled and locally-readable surface; the overlay migration is a bounded one-time cost.
  • (RC2) Committed-only reads. Simpler evaluator; rejected because it makes coll = coll + [x] silently truncate — a silent-wrong behavior this RFD exists to remove.
  • (§6) Runtime-only required-field check. Simpler, but defers feedback to execution; rejected in favor of build-time + runtime so statically-knowable omissions fail at ox build.
  • (RC2 nav fields) Reject navigation fields in mutate bodies. Keeps scope small but entrenches the overlay’s param-passing workaround and creates a read-path/write-path asymmetry; rejected in favor of parity.

Consequences

Phased implementation (each phase a complete, CI-gated PR, proven on a running example — no hollow features):

  1. RC1 — construct vs tuple + identity. Brace-only construction; positional concept-insert → hard error (OE0212); let x = insert C { … } binds; delete the unbound-Var hash fallback. Reconcile/extend MutationSemantics for the chosen surface (the reject is a surface rule; confirm the Lean models brace-construct + relation-tuple and add the positional-reject note). Proof: a keystone test constructs distinct individuals with all fields present and round-trips; the overlay’s Create* ops migrate to brace form and persist real data.
  2. RC2 — read-your-writes + navigation fields. Buffered writes feed in-body reads (scalar + collection); from/multi-hop projections compute in mutate bodies; resolve the update target: Type annotation requirement. Proof: update account set { records = account.records + [r] } accumulates across a for; the overlay’s Materialize* ops work without the direct-account workaround.
  3. RC4 — exact Real unification. Value::Real through as_f64/eval_binary/mutate aggregates. Proof: require { value == sum(record.value …) } with value: Real passes; differential test vs AggregateExact.lean.
  4. §6 — required-field validation. Build-time checker pass + runtime guard; diagnostic-code reconciliation. Proof: reject/accept tests; the overlay’s temporal entities enforce begin/end.
  5. Loudness sweep + reload fidelity. Remaining silent-wrong paths become diagnostics; round-trip/replay tests assert event-log fidelity for every emitted event kind.
  6. Overlay integration capstone. The lease Create*/Materialize*/Recognize* mutations build a faithful instance graph; Met/BreachedAt derives fire over real data (read side coordinated with RFD 0018).

Seams with RFD 0018 (RP-009). The two term evaluators (oxc-serve::eval_compute_term, oxc-runtime::resolve_term_to_value) are un-unified; RC4 and the reasoner’s compute work both touch them. Exact-Real plumbing touches oxc-reasoning::compile::Value, shared with the reasoner. The events this write path emits must be exactly what materialize_predicates reads. Ownership boundary on the shared evaluator is agreed with the RP-009 effort before refactoring shared code; the read/execution path (incrementality, indexed joins, function-application in derive bodies, tier coverage, modal soundness) is RFD 0018’s, not this RFD’s.

Drift. Any new/changed @[language_interface] shape (a CoreIR Operation, an event variant) updates the Lean inductive and the oxc-protocol mirror in lockstep (cargo xtask check-drift). The OE0212 reject is a diagnostic, not a wire shape — no drift impact.

Reference + Lean. The surface change (brace-only construct; positional-concept reject) lands in the reference (07-rules.md §7.5 + appendix-c-diagnostic-codes.md) and is reconciled with the Lean surface, per the surface-change workflow (RFD + reference → Lean → code).

Open questions

  1. update target: Type annotation. Today update target set { … } hard-errors; the annotation is required. Should it be inferred from the binder’s type (RC2 scope)? Leaning: infer where statically known, keep the annotation optional.
  2. Diagnostic-code reconciliation for required fields. OE0207 (spec-planned, nonexistent) vs OE1908 (IntrinsicPropertyMissing) vs OE1014 (RequiredFieldUnasserted) — which is the build-time code, which the runtime channel? Settled in the §6 phase against grammar.toml.
  3. property_id_for_field interning. The blake3("field:{Type}::{field}") field-key hash is a correct-but-stand-in for a real interned NameRef (RFD 0001). Promote to interned ids as part of the loudness/fidelity phase, or defer?
  4. Term-evaluator unification ownership. Who owns the unified term evaluator across the write path (this RFD) and the compute/read path (RFD 0018)? Agree the boundary before refactoring (§Consequences seam).