RFD 0042 — The re-checkable emission boundary: a self-validating .oxbin and sound direct artifact emission
- State: discussion
- Depends on: RFD 0037 (the macro atom — this amends D2), RFD 0040 (the procedural macro layer — this extends it, and revises its shipped P1 surface), RFD 0036 (heterogeneous stores — the
MappingArtifactbuild-section gate), RFD 0031 (relation-constraint plane — cardinality), RFD 0027 (meta-property plane), RFD 0009 (MLT) - Supersedes: RFD 0041 (
pub metafact— a surface-per-metadata-kind fix; this RFD makes it unnecessary). RFD 0040 stage-2’s MLT axis-assertion surface became RFD 0041; with 0041 superseded, MLT re-homes through this RFD’s emission gate, not ametafactsurface. - Blocks: the genuine, builtin-free re-home of the MLT decorators (#483); a library
foreign!(RFD 0036 placement)
Question
The macro atom (RFD 0037 D2) committed that a macro expands to surface syntax only, never to events — so everything a macro produces is re-checked by the one trusted parse→resolve→check→instantiate→classify path. That is sound, but it has a cost the campaign made concrete: any vocabulary whose effect has no surface form must be implemented as a privileged compiler builtin (a doctrine violation — the core ships vocabulary) or be granted a new substrate surface per metadata-kind (accretion; libraries gated on compiler changes). Both are paying with the wrong currency. The MLT decorators (→ MetaProperty, no surface), the relation-property checks (→ a pasted __{rel}_{prop} head), and foreign! (→ a .oxbin MappingArtifact, RFD 0036) are all stuck on this. Can a macro emit a substrate artifact directly — keeping D2’s safety — without a builtin, an unsafe escape, or surface-accretion?
Context
D2’s deepest content (RFD 0037 D7) is the CompCert / Ullrich-&-de-Moura posture: the expander is untrusted; assurance comes from re-checking its output. D2 then made one move too many — it conflated re-checked with re-parsed, and required output to be surface. But the trust comes from the re-check, not from the surface provenance: if the substrate re-checks an emitted artifact as thoroughly as a lowered one, direct emission is exactly as safe.
A two-round adversarial review (2026-06-19), verified against the merged substrate, established the load-bearing facts:
-
Today’s substrate does NOT re-check emitted events.
classifyruns only onRuleDeclbodies;resolvevalidates names;encodevalidates nothing. The load-bearing event validations live in elaboration pre-passes /lower.rs/ the runtime write-path — not over the event stream. A directly-emitted event enters the runtime throughingest_locked/seed_from(an infallible index insert), a separate ingress from the write-path (execute_mutation/apply_operation). So a hand-built or macro-emittedRelationTuplesmuggles past endpoint-existence (OE0232), arity/endpoint-type (OE0221/OE0222) and max-cardinality (OE1341); aMetaPropertysmuggles past axis domain/literal/refinement (OE0622–OE0624); a property/iof assertion past refinement (OE0668), required-field (OE0207), value-type (OE0236) and iof-on-defined (OE0211). The lone exceptions are three hand-added load-time backstops —IofAssertionabstract-target (OE0233), axis same-precedence (OE0629), and property-id injectivity (OE0231) — one of whose comments says “a hand-built artifact must not bypass the elaborator’s gate.” So direct emission is unsound on the substrate as it stands, and the gap is broader than first thought: the authors plugged three holes and the rest remain open. This latent debt exists independently of macros — any corrupt, hand-built, or foreign.oxbinalready bypasses these checks today. -
There is a third output kind. A
MappingArtifact(placement) is a.oxbinsection (id 11), deliberately versioned independently of the schema (RFD 0036 D6) and explicitly forbidden from carrying axiom-event semantics (RFD 0036 D1). It is neitherquote!-able surface nor anAxiomEvent. So{surface | event}is not exhaustive. Crucially, build-sections have a different and simpler soundness story than events (see D4): the trusted production path is config-driven, and a macro can feed the identical struct into the identical sink — parity, not a new gate. -
The event re-check set is bounded but larger than a first pass suggested — ~11–12 event-derivable load-bearing validations, not “5–6”. The census (verified file:line in the review record) is:
- Relation tuples: endpoint-existence
OE0232, arityOE0221, endpoint-typeOE0222, max-cardinalityOE1341(incl. the#[functional][0..1]rel-cap, same mechanism). - Meta-properties: axis-value-in-domain
OE0622, axis-literal-type/refinementOE0623, axis-tierOE0624. - Iof / individuals: abstract-target
OE0233(already at load), iof-on-definedOE0211, fixed-reclassificationOE0234(see Fork A — needs an encoding discriminator), property-id injectivityOE0231(already at load). - Construct/property: refinement invariant
OE0668, required-fieldOE0207, value-typeOE0236. - Axis precedence:
OE0629(already at load).
Two clarifications the first pass got wrong:
- The relation-property checks
OE1359(irreflexive) /OE1360(asymmetric) are NOT in this set. They are enforced by thecheckrules the macros emit (surface, re-parsed), which run over the relation’s full extent regardless of how a tuple entered — so they self-enforce and need no load-time pass. - The remaining validations (arg-shape,
matchexhaustiveness, defeasibility well-formednessOE0716–OE0721, temporal-qualifier forms, in-body construction completeness, the §3.4 introducer-resolution gate, relation-subsumptionOE0150–OE0154, metarel endpoint-metatypeOE0631) gate surface declarations and stay on the lowering path — macros produce those viaquote!. A handful (OE0631,OE0150–OE0154,OE0222endpoint-category) are load-bearing but not event-derivable; they bound what is emittable as an event (such artifacts stayquote!-surface).
- Relation tuples: endpoint-existence
-
The
.oxbinvalidation framework exists but is unwired and mis-shaped for this.Layer2Invariants(oxc-oxbin/src/validation.rs) is sixalways_ok()stubs (symbol_resolution,lattice_acyclicity,provenance_well_formed,composition_consistency,tier_consistency,doc_links_resolve), drift-paired Rust↔Lean (Argon/BuildArtifact/Validation.lean). None of the six slots maps onto the census checks (the near-misses are false friends:symbol_resolution≠ “endpoint exists as an individual”;tier_consistency= “rule tier ≤ envelope” ≠ “axis binds at the target’s tier”). AndModule::loadnever calls the framework — its sole caller isoxc-serve, passing the stubs. So this RFD does not “fill stubs”: it adds net-new predicates to the framework, wires the framework intoModule::load, and thereby grows the trusted base (and the Rust↔Lean drift surface). That is the honest characterisation, and it is fine — see the Rationale.
So the missing piece is not a new trust model — it is making the substrate re-check what it already should: a self-validating .oxbin.
Decision
D1 — The re-checkable emission boundary
A procedural macro may emit, in one expansion, any mix of:
Syntax— built withquote!, re-parsed and lowered through the ordinary path (RFD 0037 D2’s mechanism, retained);- typed substrate artifacts for which the substrate has a complete re-check gate over the artifact itself —
AxiomEvents (the event-stream gate, D3) and build-sections likeMappingArtifact(its existing gate, D4).
An artifact-kind with no complete gate is not directly emittable — it must be produced as Syntax (via quote!). There is no unsafe escape: emission is sound by construction because the gate re-checks every emitted artifact to the same standard a lowered one faces. The restriction is not “flat events only” but “only what the substrate fully re-checks.”
The canonical example of the not-directly-emittable case is a raw IofAssertion: its OE0234 fixed-reclassification check is not event-derivable under the current encoding (Fork A), so iof is emitted as Syntax until the discriminator lands. No current client needs to emit a raw iof event (MLT emits MetaProperty; foreign! emits MappingArtifact; the relation-property family emits check rules), so this costs nothing today — and the boundary stays honest rather than pretending the gate is complete when it is not.
D2 — Amend RFD 0037 D2
Lift D2 from “a macro expands to surface syntax, never to events” to: “a macro expands to surface syntax or to re-checked substrate artifacts.” The invariants D2 protected are preserved — not by surface provenance but by the gate (D3/D4): the drift contract (S2 — events are still typed @[language_interface] values), the §3.4 ontology-neutrality discipline (emitted names resolve through the resolver), tier-honesty (S4 — classification runs on emitted events via the gate), and determinism (canonical encoding). The privileged axis-event-emitter carve-out D2/D8 reserved is retired: MLT and the relation-property family become library macros emitting through the gate.
D3 — The self-validating .oxbin (a second trusted checker)
Implement the event-derivable load-bearing validations of Context §3 as .oxbin load-time predicates over the decoded event stream, and wire them into Module::load (which does not call the validation framework today). This is net-new trusted code, not a stub-fill: the existing Layer2Invariants slots do not cover these checks (Context §4), so the framework grows by ~11–12 predicates, drift-paired into Lean (D7).
Reading: whole-module, not single-event. The re-checker validates each event against the fully decoded module (all RelationDecl/MetaxisDecl/ConceptDecl events + the individual set), not against the event’s own body in isolation. Module::load already materialises the entire event stream before any validation could run (decode_events → Vec<AxiomEvent>), so whole-module visibility is available. This is load-bearing: OE0232 needs the workspace individual set; OE0622–OE0624 need a join of the MetaProperty event against its MetaxisDecl on axis_id (the domain/targets/tier live on the decl, not the property) — so OE0624’s tier is event-derivable via that join, resolving the earlier open question.
Architecture — shared cores where they exist, fresh load checks where they don’t. Two shapes, do not pretend they are one:
- AST-shaped checks with an extractable core (
OE0232,OE0622,OE0623, and the value/refinement familyOE0668/OE0207/OE0236): extract the predicate core keyed on resolved ids/values, behind two thin adapters — an AST-time adapter (lowering, behavior unchanged) and an event-time adapter (load). The two MUST NOT diverge on the decision (see comparability below). - Set-level aggregations with no AST core (
OE1341max-cardinality): lowering only records the[lo..hi]bracket; the actual check is a stateful runtime tuple-count overscan_live, per(tenant, fork). There is no AST predicate to factor — the load form is a fresh whole-event-stream aggregation, partitioned by(tenant, fork). Build it as such; do not force it into the two-adapter mould.
Comparability — decision-agreement, not byte-identical diagnostics. The two checkers cannot emit byte-identical diagnostics: the AST-time error carries a source offset and source-level names; the event-time error has no source span and names individuals by interned #i ids. The honest, enforceable contract is decision-agreement: on the same logical violation, both emit the same diagnostic code (and the closest available message). Golden tests pin agreement at the decision level (code fires / does not fire on a corpus of matched cases). This is weaker than byte-identity — it cannot catch message/hint drift or pin spans — and the RFD says so plainly rather than overclaiming.
Independent value (the clincher): this makes every .oxbin self-validating against corrupt, hand-built, foreign, or macro-emitted artifacts — paying down latent soundness debt that exists today regardless of macros (Context §1). So D3 is a substrate-trust feature macros happen to need, and ships first on its own merits. (Note: min-cardinality OW1342 is recorded-but-unenforced today even in lowering; D3 may close it at load, but it is out of the load-bearing soundness set and tracked separately.)
D4 — Build-sections re-check by their own gate (the third kind)
A MappingArtifact is re-checked by validate_composition_signature (OE1213, which re-derives the composition signature from the event legs and anchors the mapping’s pin to the actual elaborated wiring) + check_against_schema (OE1245), both already invoked at load. A macro emitting a build-section is admitted iff that kind’s gate runs.
The soundness story for build-sections is parity, not a new gate. The trusted production path for a MappingArtifact is config-driven (ox.toml [store]/[placement] → MappingSections → oxbin_for_events). A macro that feeds the same PlacementDecl struct into the same MappingSections the config path feeds produces an artifact indistinguishable downstream from a config one → it gets identical compile, pin, and load treatment. The soundness bar is parity with the trusted config path, which holds by construction. (Placement mapping directives are connector-opaque and validated at scan-time for any placement; a macro-emitted one carries no more risk than a config one.) So build-section emission needs no new event-validation (D3) — it rides the existing gate, and is unblocked independently of the self-validating-.oxbin substrate lift.
Phase-ordering protocol (the one real wrinkle). The composition-signature pin is a fixpoint over the entire elaborated event set, computed at build time (oxbin_for_events), after macros run at expand. So a macro cannot and must not supply the pin. The protocol:
- Expand: the macro emits an unpinned placement payload — a
PlacementDecl(storename +mappingdirectives) keyed by the logical relation name. Everything inplacement_hash()is knowable at expand from the macro’s input;schema_signatureis left unset. - Build: the existing
MappingSections::compile(composition_signature)stamps the pin over the merged placements — identically to the config path. - Load:
OE1213+OE1245run unchanged.
Wiring task (the RFD 0036 seam). MappingSections is populated today exclusively by parse_mapping_sections from ox.toml. The emission boundary opens a second source: macro-emitted PlacementDecls, surfaced from the EXPAND phase (D5), unioned into the MappingSections the driver hands to oxbin_for_events. A placement for the same logical relation declared by both config and a macro is a loud refusal (a new OE code — never the silent BTreeMap::insert overwrite). ox.toml [placement] remains supported as an alternative. The macro emits only the PlacementDecl; the StoreDecl (store kind + @deploy: handle — a deployment concern, RFD 0036 C11) stays in ox.toml.
D5 — The emission surface (extends, and revises, RFD 0040)
A procedural macro returns an Expansion — a sequence of emissions, each either a Syntax value or a typed substrate artifact:
#[procmacro]
pub fn foreign(item: Decl, attr: Args) -> Expansion =
[ quote! { $item }, // Syntax → re-parsed (the pure `pub rel`)
Placement { rel: item.name, store: attr.store, mapping: attr.mapping } ]; // build-section → MappingArtifact gate (D4), unpinned
#[procmacro]
pub fn categorizes(item: Decl, target: Ident) -> Expansion =
[ quote! { $item },
MetaProperty { subject: item.name, axis: std::mlt::categorizes, value: target } ]; // event → re-checked at load (D3)
quote!buildsSyntax; splices ($item,$item.name) interpolate reflected values. A singleSyntaxauto-lifts to anExpansion, so a surface-only macro keeps returning-> Syntax(the shipped P1 procmacros are unchanged).- A typed-constructor literal (
MetaProperty { … },Placement { … }) builds an artifact; its names are symbolic and re-resolved by the gate. - No
emitkeyword (taken by the emit/sink feature) and nounsafe— the macro returns its emissions; soundness is the gate’s job.
This revises RFD 0040’s shipped P1 surface — own it. P1 (#581) ships quote { … } (a brace form), -> Syntax, $item as a substitution barrier, ${expr} antiquotation. This RFD re-specs quote! (a !-invoked builtin, not a keyword — already contextual from P1, consistent with format!), -> Expansion, and $item as a splice. That is a code migration of live surface, the P1 parser recognition (quote_brace_at), the shipped std::rel irreflexive/asymmetric procmacros, and their tests — not a greenfield addition. The migration may ride this RFD’s implementation or land as a focused surface-migration PR first; either way D5 owns that it changes shipped P1, and silently resolves the ${} antiquotation spelling that #575 left open (string interpolation as a runtime feature remains a separate arc, #575).
foreign! surface (Ivan, 2026-06-19): the attribute form #[foreign(...)] ships first; the item form foreign! { … } is a later addition (both spellings eventually). The attribute form requires threading ATTR_ARGS into the macro invocation as a second attr input (Rust’s #[proc_macro_attribute] fn(attr, item) shape) — today rewrite_attribute_macro parses but drops attribute args (expand.rs:236); this is a bounded, contained change. #[foreign] is consistent with the shipped #[transitive]/#[irreflexive] attribute spelling (Argon has no @-decorators; #[name] is the shipped form).
D6 — The soundness contract
An emitted artifact is admitted iff its kind’s re-check gate subjects it to the same validations a lowered/config one faces: names through the resolver; the event-stream validations (D3) for events, or the section gate (D4) for build-sections; tier classification; canonical encoding. The standing invariant that makes “no unsafe” true: no event-derivable load-bearing validation may live only in lowering or only on the write-path. That invariant is currently violated by the substrate itself across the ~8 write-path/lowering-only checks of Context §1 — D3 pays that debt down. It is then kept honest by the one-core/two-adapter discipline plus decision-agreement golden tests, and (Open Q b) a structural CI gate asserting every relocated core has an event-time adapter, at decision-agreement granularity.
D7 — The Lean line
The @[language_interface] event types are already drift-gated. What this RFD adds to the Lean substrate:
- State the new load-time validations as total predicates over the event model. These are net-new and grow the drift-paired
Layer2struct in both Rust and Lean — there is no existing slot to reuse (Context §4). - Prove the bridge lemma “a lowered event and a directly-emitted event satisfy the same validation predicate.” For the AST-shaped checks this is the formal content of “one core, two adapters.” For
OE1341the lemma is over an event set (a partitioned aggregation with order-insensitive count), not per-event — state it at that granularity; the per-event form does not type-check against the actual check. - Correction to the draft’s earlier claim: there is today no Lean model of fact/metaproperty → event lowering (
CoreIR/Lowering.leanproves only structural count-preservation; full term lowering is stubbed). So the bridge lemma rests on net-new Lean infra, not an existing lowering model. The event-side predicate scaffolding (AxisBindingValid.lean) exists and is reusable. - RFD 0040 D7’s tier-honesty theorem (
classify ∘ expand) is unaffected (classification already runs on emitted events through the gate) and remains gated on P2 makingexpanda Lean object. No expansion-preservation proof is owed (RFD 0037 D7 still holds: mechanize the re-checker, not the producer — D3 is the re-checker, now total and emission-agnostic).
Fork A — OE0234 fixed-reclassification: quote! now, the discriminator later (committed)
OE0234 rejects a re-classification — an iof that is not the construction site — onto a fixed type. The distinguishing information is the operation variant (Construct vs InsertIof), which is erased at emit: both paths call the identical encoder, and IofAssertionBody = {concept_id, individual_id} carries no construction-site marker. So a load pass over the flat event stream cannot re-check OE0234 under the current encoding.
Resolution (Ivan, 2026-06-19): for now, iof stays Syntax-only-emittable (macros quote! it; the elaborator, which has the operation variant, checks OE0234). This is principled per D1 (an artifact whose gate is incomplete is produced as Syntax) and costs nothing today (no client needs raw iof emission).
This is a committed deferral with a real prerequisite, NOT “no client.” We intend to implement the full construction-site discriminator — a wire-format change so construction and reclassification encode (and content-hash) differently, drift-gated — if it is the correct choice, which makes raw iof directly-emittable and OE0234 event-derivable. The capability is genuinely gated on that unbuilt encoding, not on a hypothetical future client. Tracked as real scope.
Rationale
The design pays in the right currency. D2’s safety came from re-checking, not from surface; we keep the re-check and drop the incidental surface requirement — and in doing so we are forced to make the re-check actually run on artifacts (the self-validating .oxbin), which is independently the correct hardening of a content-addressed artifact format whose soundness debt the review made concrete. The alternative currencies are wrong: a builtin spends ontology-neutrality; a surface-per-kind spends substrate minimality and gates libraries on compiler work; an unsafe escape spends soundness (the review showed it would genuinely be unsound, so the escape would be load-bearing, not cosmetic). “Restrict to what the substrate fully re-checks, and make the substrate re-check fully” is the only option that spends nothing — and it grows the trusted base honestly (D3 is a second trusted checker, declared as such), which is the correct place to spend, because a re-checker is exactly what the CompCert posture asks us to trust.
The scope is larger than a first pass claimed (~11–12 event checks, a wire-format discriminator owed for iof, the Layer2/Lean drift surface grows). That is not the design weakening — it is the review revealing pre-existing soundness debt that this work pays down. Build-sections (foreign!) need none of it and ship first.
Alternatives
- RFD 0041 (
pub metafactsurface). Sound (it lowers through the validating path) and neutral, but it is the surface-per-metadata-kind accretion: MLT needsmetafact,foreign!needs a placement form, the next vocabulary needs the next form — libraries perpetually gated on compiler surface work. Superseded. unsafedirect emission. Cheap, neutral, no accretion — but a genuine trust-waiver: the review proved a directly-emitted event bypasses real checks, sounsafewould be load-bearing, and a content-addressed.oxbinwould still be un-self-validating against corrupt input. Rejected in favor of fixing the substrate.- Privileged builtins (status quo). The doctrine violation this whole arc exists to kill.
Consequences
foreign!becomes a library macro (attribute form#[foreign(...)]first, item form later) emitting aMappingArtifactplacement through the build-section gate (D4). Unblocked now — it needs only theExpansionsurface (D5), theMappingSectionsunion + collisionOEcode, and the#[foreign]attr-args plumbing; noLayer2work. This is the first client and the proof of the mixed-emission boundary.- MLT (
#[categorizes(T)]…) and#[order(N)]become library#[procmacro]s emittingMetaProperty;lower_relational_mlt_decorators/lower_order_decoratorare deleted; #483 closes genuinely. Added scope the review surfaced: this depends on (a) theMetaPropertyevent checksOE0622–OE0624at load (D3), and (b) building the MLT concept-membership semanticsOE190x/OE1904(partition disjointness, order) which are currentlyreserved/unbuilt even in lowering — so “MLT emitsMetaProperty” is bigger than deleting the builtin. Also resolve #517 (MLT metarel signatures disagree across book/std::mlt/Lean) during the re-home. - The relation-property family keeps its current re-home:
#[irreflexive]/#[asymmetric]are already genuine procmacros emittingcheckrules (self-enforcing, noLayer2);#[functional]on arelstays a#[builtin]stub until P2 (its[0..1]cardinality-cap needs structural re-emission); the metarel-functionalcheck is already re-homed. .oxbinbecomes self-validating at load against any malformed artifact — a robustness win beyond macros, paying down the Context §1 debt.- No user-source migration anywhere:
#[categorizes(T)]/#[foreign(...)]are spelled identically; only the (already-required) imports and the byte-identical/parity artifacts are observable (RFD 0037 S3). - #587 / RFD 0041 closes as superseded (done).
Implementation sequence:
- The
Expansion/emission surface (D5) in the procmacro evaluator (procmacro.rs): theExpansionreturn type + auto-lift, the EXPANDVec<EmittedArtifact>buffer, surfaced fromexpand_package_workspace. Includes thequote{}→quote!P1 migration (or a preceding focused PR). foreign!(build-section track) — in parallel, noLayer2dependency: thePlacementconstructor inprocmacro.rs, theMappingSectionsunion + collisionOE, the#[foreign]attr-args plumbing. (Heterostore team owns the connector +MappingSectionsseam; macro side owns the emission buffer.)- The self-validating
.oxbin(D3, substrate-trust track, ships on its own merit): the ~11–12 event predicates, wired intoModule::load; theLayer2framework grown + drift-paired; decision-agreement golden tests. - MLT re-home (#483) — needs (1)+(3) plus building
OE190x/OE1904; delete the builtins. - The Lean predicates + bridge lemmas (D7), including the set-level
OE1341lemma. - (committed, later) the
iofconstruction-site discriminator (Fork A) → rawiofbecomes directly-emittable.
Open questions
a. OE0624 (axis tier). Resolved — event-derivable via the MetaProperty↔MetaxisDecl join on axis_id (the tier lives in MetaxisDeclBody.targets). Confirm the join is reliable at load.
b. Enforcing the D6 invariant structurally. Make “every relocated predicate core has an event-time adapter” a drift/CI gate (at decision-agreement granularity), so a future lowering-only check can’t silently re-open the hole.
c. Expansion ergonomics. The exact type for the mixed-emission return (heterogeneous list vs a named Expansion builder) and the single-Syntax auto-lift surface.
d. fixed-reclassification (OE0234). Resolved by Fork A — quote!-only for now; the construction-site discriminator is committed scope.
e. Which other build-sections become emittable (DocBlocks, IndividualNames) — each only when it carries a re-check gate (D4). Per the census, most .oxbin sections have no load gate today (not even byte-integrity recompute), so they stay non-emittable until one is added.
f. OE1341 partitioning. Confirm the load aggregation partitions per (tenant, fork) exactly as the write-path does, to avoid false rejections.