RFD 0004 — pub fact declarations
- State: discussion
- Opened: 2026-05-28
- Decides: surface syntax for declaring ground-truth facts (axioms) in source code; per-mode semantics relative to
pub mutate; the relationship between source-declared facts and runtime-asserted facts in storage; how compile-time materialization (#[comptime]per RFD 0002) feeds on these facts.
Question
Today, ground-truth facts in an Argon system can only arrive at runtime via mutation invocation. The keystone test demonstrates this: there’s no way to write Person(alice) in source code; the only path is to call register(alice). Should Argon admit a source-level fact declaration syntax? If so, what’s the right surface, what’s the relationship to mutations and storage, and how does compile-time materialization fit?
Context
What the gap looks like today
pub kind Person; declares a concept. pub rel hasAge(p: Person, n: Nat); declares a relation. pub derive adult(p) :- Person(p), hasAge(p, n), n >= 18; declares a rule. Nothing in the current surface lets a modeler write Person(alice) or hasAge(alice, 25) as a fact. The only way to get those tuples into the store is to invoke a mutation at runtime.
This creates several problems:
-
#[comptime]is starved. RFD 0002 specifies compile-time materialization for derive rules. But the build-time engine has no facts to materialize against — every fact requires a runtime mutation call. The#[comptime]flag stamps on the rule body but the lifter has nothing to evaluate. -
Test fixtures are awkward. The keystone test had to inject
hasAge(p, 25)via a test-only helper because the only fact path was through mutation execution (which until this session didn’t bind integer args). Even after Value-typed mutation args landed, declarative tests want to express “given these facts” without runtime invocation gymnastics. -
Bootstrap data is awkward. Real ontologies have built-in facts: standard units, well-known individuals, vocabulary mappings (e.g., the OWL stdlib’s
rdf:typepredicate; the temporal stdlib’sSecond,Minuteinstances). Today these would need runtime mutation calls at startup. Source-level facts let them ship as part of the.oxbin. -
The substrate’s “five atoms” model is incomplete without it. The five atoms (meta-calculus, construct, rule, trait, macro) describe declarations. But declarations alone don’t carry data — they describe shape. Facts ARE data; they should have a declarative surface.
What “fact” means semantically
A fact in Argon is an axiom event of kind iof_assertion, relation_tuple, meta_property, subsumption_axiom, or partition_axiom. Each is a positive ground proposition. Modelers writing source today can declare structure (concepts, relations) and behavior (mutations, derive rules) but cannot declare contents (ground propositions).
The minimal addition: surface syntax that emits these axiom events at build time. The .oxbin carries them just like mutation-emitted events. The reasoner sees them as live state. Compile-time materialization can finally run with input.
Decision
Surface syntax
pub fact Person(alice);
pub fact Person(bob);
pub fact hasAge(alice, 25);
pub fact hasAge(bob, 12);
The pub fact keyword sequence introduces a single ground proposition. Its body is exactly the same syntactic shape as a mutation’s insert operations — a predicate name followed by a parenthesized argument list. Each argument is either an identifier (interpreted as an Individual reference) or a literal (Int, Bool, Text).
Multiple facts can share a single declaration via a brace-delimited block:
pub fact {
Person(alice);
Person(bob);
hasAge(alice, 25);
hasAge(bob, 12);
}
This is purely a syntactic convenience — semantically identical to N independent pub fact declarations.
Identity allocation for fact arguments
When a fact references an identifier (alice, bob), the elaborator allocates a fresh IndividualId for it (per RFD 0001’s “individuals are system-allocated” rule). The mapping identifier → IndividualId is interned per workspace build, so:
- The same identifier mentioned in two
pub factdeclarations resolves to the sameIndividualId. - Different builds of the same source produce the same
IndividualId(deterministic). - Different identifiers (
alicevsbob) get distinctIndividualIds.
The identifier itself becomes available for runtime lookup via the workspace’s symbol table — callers can pass Value::Individual(alice_id) to mutations or queries. (Today the runtime hashes the variable name for unbound parameters; this convention extends naturally.)
Per-axiom-kind semantics
| Fact shape | Resulting axiom event |
|---|---|
pub fact Concept(individual) | iof_assertion { concept_id = NameRef(Concept), individual_id = IndividualId(individual) } |
pub fact relation(arg1, arg2, ...) | relation_tuple { relation_id = NameRef(relation), args = [encode(arg1), encode(arg2), ...] } |
pub fact axis(target) = value | meta_property { axis_id = NameRef(axis), target_id = NameRef(target), value } (sugar for the meta-property family per §13) |
pub fact A <: B | subsumption_axiom { sub_id = NameRef(A), super_id = NameRef(B) } (an alternative spelling of the <: in concept declarations) |
The fourth row’s surface (pub fact A <: B) is shorthand for declaring subsumption outside a concept declaration’s <: clause — useful for late-binding or library-extension idioms. Initial implementation focuses on the first two rows; meta-property and subsumption shapes land when the broader §13 / §11 substrate matures.
Relationship to mutations
A pub fact declaration is equivalent to a mutation that runs at module load with no parameters. The compile-time-equivalent expansion:
// User writes:
pub fact Person(alice);
pub fact hasAge(alice, 25);
// Semantically equivalent to (but the compiler does NOT actually
// emit this; it emits the axiom events directly):
pub mutate __bootstrap_module() {
insert iof(alice, Person);
insert hasAge(alice, 25);
}
// + invocation of __bootstrap_module at module-load time
The advantages of NOT going through a synthetic mutation:
- Directly content-addressed. The fact’s
AxiomKeyisBLAKE3-128(canonical_body), deterministic per RFD 0001. Mutations get freshEventIds at invocation; facts get content-stable identity. - No runtime invocation needed. The
.oxbincarries the events directly; the runtime sees them inStore::seed_from(&Module). - Compile-time materialization works. RFD 0002’s
#[comptime]lifter finally has inputs.
Interaction with #[comptime]
After pub fact lands, the comptime lifter has a non-trivial path:
pub fact Person(alice);
pub fact Person(bob);
pub fact hasAge(alice, 25);
pub fact hasAge(bob, 12);
#[comptime]
pub derive adult(p) :- Person(p), hasAge(p, n), n >= 18;
At build time, the lifter:
- Collects facts from
pub factdeclarations → catalog seed. - Runs the comptime rule against the seed.
- Emits the derived tuples (
adult(alice)in this case) as freshiof_assertion/relation_tupleevents with derivation provenance. - The runtime sees the derived facts directly; no re-evaluation needed.
Mutability + lifecycle
pub fact declares an immutable ground fact. The fact exists from module-load onwards. Retraction must go through:
- A mutation:
pub mutate forget_alice() { delete iof(alice, Person); }. - A retraction event at runtime (when the storage layer supports it).
The fact’s AxiomKey is content-addressed, so even after retraction the event log records both the assertion and the retraction. Replaying the log reproduces the lineage.
Diagnostic surface
OE0210 FactReferencesUnknownConcept—pub fact Person(alice)wherePersonis not a declared concept.OE0211 FactArgArityMismatch—pub fact hasAge(alice)for a binary relation.OE0212 FactArgTypeMismatch—pub fact hasAge(alice, "twenty-five")where the relation expectsNat.OE0213 FactInsideFn—pub factis a module-level declaration; not admitted inside function bodies, traits, etc.
Rationale
Why a new keyword fact, not overloaded syntax
Alternatives we ruled out:
pub Person(alice);— overloads the visibility keyword. Reads ambiguously next topub kind Person.pub iof alice : Person;— too low-level; exposes the wire-format axiom kind to the surface.#[fact] Person(alice);— attribute-driven, but attributes describe modifiers, not the kind of declaration. Confusing.
pub fact reads cleanly, parallels pub kind / pub rel, and is unambiguous about meaning.
Why allow block form
The block form (pub fact { ... }) is shorthand for repeated declarations. Real modules have lots of bootstrap facts (a stdlib of units, well-known individuals); writing 50 pub fact declarations vs. one block of 50 is just keyboard noise.
The block form’s semantic is identical to N independent declarations. No grouping invariants implied.
Why content-addressed AxiomKey, not synthetic mutation EventId
Two reasons:
- Determinism: same source → same .oxbin bytes. The fact’s identity comes from its content, not from a mutation’s invocation order.
- Lineage clarity: a
pub factdeclaration is a statement about reality, not an operation that “happens.” Treating it as an event with content-derived identity matches the modeler’s mental model.
Why immutable
pub fact describes ground truth as the modeler authored it. Allowing inline mutation (pub fact Person(alice); pub fact !Person(alice); — withdraw the prior?) would conflate declaration with operation. Cleaner: declarations are immutable; runtime mutations are the path for state change.
The .oxbin-resident facts can still be retracted by a mutation at runtime; the assertion + retraction live in the event log together.
Why not full datalog-style “extensional database” syntax
Some Datalog systems allow large facts via a separate file format (e.g., .csv of relation tuples, loaded by relation name). Argon could borrow this.
Rejected for the v1 surface: introduces a second source format with different rules. The same module would split between .ar (rules) and .csv (facts) — a coordination burden. pub fact keeps everything in one file format.
For very large fact sets (e.g., million-row knowledge bases), a tooling layer can pre-process external data into pub fact declarations. The language surface stays uniform.
Alternatives considered
Alt 1: No pub fact; require synthetic mutations
The modeler writes pub mutate bootstrap() { insert iof(alice, Person); } and a runtime layer calls it on module load.
Rejected: confuses operations (mutations) with declarations (facts). Every fact-heavy module would have boilerplate. The #[comptime] lifter would still need to recognize “this mutation is actually fact bootstrapping” — same problem at one layer of indirection.
Alt 2: data keyword (a la SQL)
pub data hasAge { alice = 25; bob = 12; }
Rejected: “data” overemphasizes table-shaped thinking. Argon facts can be heterogeneous (subsumption axioms, meta-properties); data reads like “tabular data only.”
Alt 3: Module-level expression syntax
pub Person(alice); at module scope (no leading keyword).
Rejected: parser ambiguity with function calls. pub would have to mean two different things depending on what follows; messy.
Alt 4: Inline in concept declarations
pub kind Person {
instances: alice, bob, carol;
}
Rejected: couples instance enumeration to concept declaration. The whole point of separating concept from instance is that instances can be added without modifying the concept. Argon’s metatype model assumes this separation.
Consequences
Wire format
No new axiom kinds — pub fact emits existing iof_assertion, relation_tuple, etc. events. The wire format is unchanged.
Parser changes
New keyword FACT_KW. New FactDecl AST node parallel to ConceptDecl / RelDecl. ~50 LOC in grammar.toml + argon.ungrammar + oxc-parser/src/grammar.rs.
Elaboration changes
oxc-instantiate gains a lower_fact_decl similar to lower_concept_decl. ~100-150 LOC. The fact’s args resolve through the same SymbolTableBuilder as other declarations.
Identity allocation for fact-mentioned individuals
The SymbolTableBuilder gains an intern_individual(name) -> IndividualId method. The first mention of alice allocates a fresh IndividualId; subsequent mentions return the same one. Per-build deterministic.
Comptime lifter
Per RFD 0002, the comptime lifter materializes derive rules into fresh axiom events at build time. After pub fact lands, the lifter has facts to evaluate against. The lifter implementation can finally land.
Test ergonomics
Tests like the keystone become much cleaner:
#![allow(unused)]
fn main() {
let source = r#"
mod demo;
pub kind Person;
pub rel hasAge(p: Person, n: Nat);
pub fact Person(alice);
pub fact hasAge(alice, 25);
pub derive adult(p) :- Person(p), hasAge(p, n), n >= 18;
"#;
// ...load, query, assert. No mutation calls. No test helpers.
}
Open questions
-
How are
pub factdeclarations sorted in the events section? Per the canonical-event-key sort, events sort by(tenant, fork, standpoint, kind, vt_start, tx_from). Facts gettx_from = 0(declared at the workspace’s origin time)? Or do they get a deterministictx_fromderived from source position? The latter preserves source order; defer to implementation. -
Can
pub factreference future-declared symbols?pub fact Person(alice);followed bypub kind Person;— does this work? Symmetric to howderivecan reference forward-declared concepts. Default: yes; the elaborator does a single AST walk and resolves all references. Concrete error case:pub fact UnknownConcept(alice);→ OE0210. -
pub factinsidemodblocks —pub mod sub { pub fact ... }admits per-submodule fact scoping. Symbol resolution follows module visibility rules (per §3.1). No special handling beyond what’s already established. -
Bulk-load tooling — for million-row fact sets, a CLI/tooling story is needed. Out of scope for the language surface; addressable post-MVP.
-
Interaction with standpoint scoping — when standpoints are surfaced, a fact in
standpoint Xis only visible in X (and its supertypes per FDE info-join). Defer to the standpoint substrate work track. -
Negative facts / classical negation —
pub not_fact Person(alice)to assert that alice is not a Person. Useful in OWA / classical contexts; relates to §12.2’s Truth4 bilattice. Defer; orthogonal to this RFD’s positive-only scope.