Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

RFD 0004 — pub fact declarations

  • State: discussion
  • Opened: 2026-05-28
  • Decides: surface syntax for declaring ground-truth facts (axioms) in source code; per-mode semantics relative to pub mutate; the relationship between source-declared facts and runtime-asserted facts in storage; how compile-time materialization (#[comptime] per RFD 0002) feeds on these facts.

Question

Today, ground-truth facts in an Argon system can only arrive at runtime via mutation invocation. The keystone test demonstrates this: there’s no way to write Person(alice) in source code; the only path is to call register(alice). Should Argon admit a source-level fact declaration syntax? If so, what’s the right surface, what’s the relationship to mutations and storage, and how does compile-time materialization fit?

Context

What the gap looks like today

pub kind Person; declares a concept. pub rel hasAge(p: Person, n: Nat); declares a relation. pub derive adult(p) :- Person(p), hasAge(p, n), n >= 18; declares a rule. Nothing in the current surface lets a modeler write Person(alice) or hasAge(alice, 25) as a fact. The only way to get those tuples into the store is to invoke a mutation at runtime.

This creates several problems:

  1. #[comptime] is starved. RFD 0002 specifies compile-time materialization for derive rules. But the build-time engine has no facts to materialize against — every fact requires a runtime mutation call. The #[comptime] flag stamps on the rule body but the lifter has nothing to evaluate.

  2. Test fixtures are awkward. The keystone test had to inject hasAge(p, 25) via a test-only helper because the only fact path was through mutation execution (which until this session didn’t bind integer args). Even after Value-typed mutation args landed, declarative tests want to express “given these facts” without runtime invocation gymnastics.

  3. Bootstrap data is awkward. Real ontologies have built-in facts: standard units, well-known individuals, vocabulary mappings (e.g., the OWL stdlib’s rdf:type predicate; the temporal stdlib’s Second, Minute instances). Today these would need runtime mutation calls at startup. Source-level facts let them ship as part of the .oxbin.

  4. The substrate’s “five atoms” model is incomplete without it. The five atoms (meta-calculus, construct, rule, trait, macro) describe declarations. But declarations alone don’t carry data — they describe shape. Facts ARE data; they should have a declarative surface.

What “fact” means semantically

A fact in Argon is an axiom event of kind iof_assertion, relation_tuple, meta_property, subsumption_axiom, or partition_axiom. Each is a positive ground proposition. Modelers writing source today can declare structure (concepts, relations) and behavior (mutations, derive rules) but cannot declare contents (ground propositions).

The minimal addition: surface syntax that emits these axiom events at build time. The .oxbin carries them just like mutation-emitted events. The reasoner sees them as live state. Compile-time materialization can finally run with input.

Decision

Surface syntax

pub fact Person(alice);
pub fact Person(bob);
pub fact hasAge(alice, 25);
pub fact hasAge(bob, 12);

The pub fact keyword sequence introduces a single ground proposition. Its body is exactly the same syntactic shape as a mutation’s insert operations — a predicate name followed by a parenthesized argument list. Each argument is either an identifier (interpreted as an Individual reference) or a literal (Int, Bool, Text).

Multiple facts can share a single declaration via a brace-delimited block:

pub fact {
    Person(alice);
    Person(bob);
    hasAge(alice, 25);
    hasAge(bob, 12);
}

This is purely a syntactic convenience — semantically identical to N independent pub fact declarations.

Identity allocation for fact arguments

When a fact references an identifier (alice, bob), the elaborator allocates a fresh IndividualId for it (per RFD 0001’s “individuals are system-allocated” rule). The mapping identifier → IndividualId is interned per workspace build, so:

  • The same identifier mentioned in two pub fact declarations resolves to the same IndividualId.
  • Different builds of the same source produce the same IndividualId (deterministic).
  • Different identifiers (alice vs bob) get distinct IndividualIds.

The identifier itself becomes available for runtime lookup via the workspace’s symbol table — callers can pass Value::Individual(alice_id) to mutations or queries. (Today the runtime hashes the variable name for unbound parameters; this convention extends naturally.)

Per-axiom-kind semantics

Fact shapeResulting axiom event
pub fact Concept(individual)iof_assertion { concept_id = NameRef(Concept), individual_id = IndividualId(individual) }
pub fact relation(arg1, arg2, ...)relation_tuple { relation_id = NameRef(relation), args = [encode(arg1), encode(arg2), ...] }
pub fact axis(target) = valuemeta_property { axis_id = NameRef(axis), target_id = NameRef(target), value } (sugar for the meta-property family per §13)
pub fact A <: Bsubsumption_axiom { sub_id = NameRef(A), super_id = NameRef(B) } (an alternative spelling of the <: in concept declarations)

The fourth row’s surface (pub fact A <: B) is shorthand for declaring subsumption outside a concept declaration’s <: clause — useful for late-binding or library-extension idioms. Initial implementation focuses on the first two rows; meta-property and subsumption shapes land when the broader §13 / §11 substrate matures.

Relationship to mutations

A pub fact declaration is equivalent to a mutation that runs at module load with no parameters. The compile-time-equivalent expansion:

// User writes:
pub fact Person(alice);
pub fact hasAge(alice, 25);

// Semantically equivalent to (but the compiler does NOT actually
// emit this; it emits the axiom events directly):
pub mutate __bootstrap_module() {
    insert iof(alice, Person);
    insert hasAge(alice, 25);
}
// + invocation of __bootstrap_module at module-load time

The advantages of NOT going through a synthetic mutation:

  1. Directly content-addressed. The fact’s AxiomKey is BLAKE3-128(canonical_body), deterministic per RFD 0001. Mutations get fresh EventIds at invocation; facts get content-stable identity.
  2. No runtime invocation needed. The .oxbin carries the events directly; the runtime sees them in Store::seed_from(&Module).
  3. Compile-time materialization works. RFD 0002’s #[comptime] lifter finally has inputs.

Interaction with #[comptime]

After pub fact lands, the comptime lifter has a non-trivial path:

pub fact Person(alice);
pub fact Person(bob);
pub fact hasAge(alice, 25);
pub fact hasAge(bob, 12);

#[comptime]
pub derive adult(p) :- Person(p), hasAge(p, n), n >= 18;

At build time, the lifter:

  1. Collects facts from pub fact declarations → catalog seed.
  2. Runs the comptime rule against the seed.
  3. Emits the derived tuples (adult(alice) in this case) as fresh iof_assertion / relation_tuple events with derivation provenance.
  4. The runtime sees the derived facts directly; no re-evaluation needed.

Mutability + lifecycle

pub fact declares an immutable ground fact. The fact exists from module-load onwards. Retraction must go through:

  • A mutation: pub mutate forget_alice() { delete iof(alice, Person); }.
  • A retraction event at runtime (when the storage layer supports it).

The fact’s AxiomKey is content-addressed, so even after retraction the event log records both the assertion and the retraction. Replaying the log reproduces the lineage.

Diagnostic surface

  • OE0210 FactReferencesUnknownConceptpub fact Person(alice) where Person is not a declared concept.
  • OE0211 FactArgArityMismatchpub fact hasAge(alice) for a binary relation.
  • OE0212 FactArgTypeMismatchpub fact hasAge(alice, "twenty-five") where the relation expects Nat.
  • OE0213 FactInsideFnpub fact is a module-level declaration; not admitted inside function bodies, traits, etc.

Rationale

Why a new keyword fact, not overloaded syntax

Alternatives we ruled out:

  • pub Person(alice); — overloads the visibility keyword. Reads ambiguously next to pub kind Person.
  • pub iof alice : Person; — too low-level; exposes the wire-format axiom kind to the surface.
  • #[fact] Person(alice); — attribute-driven, but attributes describe modifiers, not the kind of declaration. Confusing.

pub fact reads cleanly, parallels pub kind / pub rel, and is unambiguous about meaning.

Why allow block form

The block form (pub fact { ... }) is shorthand for repeated declarations. Real modules have lots of bootstrap facts (a stdlib of units, well-known individuals); writing 50 pub fact declarations vs. one block of 50 is just keyboard noise.

The block form’s semantic is identical to N independent declarations. No grouping invariants implied.

Why content-addressed AxiomKey, not synthetic mutation EventId

Two reasons:

  • Determinism: same source → same .oxbin bytes. The fact’s identity comes from its content, not from a mutation’s invocation order.
  • Lineage clarity: a pub fact declaration is a statement about reality, not an operation that “happens.” Treating it as an event with content-derived identity matches the modeler’s mental model.

Why immutable

pub fact describes ground truth as the modeler authored it. Allowing inline mutation (pub fact Person(alice); pub fact !Person(alice); — withdraw the prior?) would conflate declaration with operation. Cleaner: declarations are immutable; runtime mutations are the path for state change.

The .oxbin-resident facts can still be retracted by a mutation at runtime; the assertion + retraction live in the event log together.

Why not full datalog-style “extensional database” syntax

Some Datalog systems allow large facts via a separate file format (e.g., .csv of relation tuples, loaded by relation name). Argon could borrow this.

Rejected for the v1 surface: introduces a second source format with different rules. The same module would split between .ar (rules) and .csv (facts) — a coordination burden. pub fact keeps everything in one file format.

For very large fact sets (e.g., million-row knowledge bases), a tooling layer can pre-process external data into pub fact declarations. The language surface stays uniform.

Alternatives considered

Alt 1: No pub fact; require synthetic mutations

The modeler writes pub mutate bootstrap() { insert iof(alice, Person); } and a runtime layer calls it on module load.

Rejected: confuses operations (mutations) with declarations (facts). Every fact-heavy module would have boilerplate. The #[comptime] lifter would still need to recognize “this mutation is actually fact bootstrapping” — same problem at one layer of indirection.

Alt 2: data keyword (a la SQL)

pub data hasAge { alice = 25; bob = 12; }

Rejected: “data” overemphasizes table-shaped thinking. Argon facts can be heterogeneous (subsumption axioms, meta-properties); data reads like “tabular data only.”

Alt 3: Module-level expression syntax

pub Person(alice); at module scope (no leading keyword).

Rejected: parser ambiguity with function calls. pub would have to mean two different things depending on what follows; messy.

Alt 4: Inline in concept declarations

pub kind Person {
    instances: alice, bob, carol;
}

Rejected: couples instance enumeration to concept declaration. The whole point of separating concept from instance is that instances can be added without modifying the concept. Argon’s metatype model assumes this separation.

Consequences

Wire format

No new axiom kinds — pub fact emits existing iof_assertion, relation_tuple, etc. events. The wire format is unchanged.

Parser changes

New keyword FACT_KW. New FactDecl AST node parallel to ConceptDecl / RelDecl. ~50 LOC in grammar.toml + argon.ungrammar + oxc-parser/src/grammar.rs.

Elaboration changes

oxc-instantiate gains a lower_fact_decl similar to lower_concept_decl. ~100-150 LOC. The fact’s args resolve through the same SymbolTableBuilder as other declarations.

Identity allocation for fact-mentioned individuals

The SymbolTableBuilder gains an intern_individual(name) -> IndividualId method. The first mention of alice allocates a fresh IndividualId; subsequent mentions return the same one. Per-build deterministic.

Comptime lifter

Per RFD 0002, the comptime lifter materializes derive rules into fresh axiom events at build time. After pub fact lands, the lifter has facts to evaluate against. The lifter implementation can finally land.

Test ergonomics

Tests like the keystone become much cleaner:

#![allow(unused)]
fn main() {
let source = r#"
mod demo;
pub kind Person;
pub rel hasAge(p: Person, n: Nat);

pub fact Person(alice);
pub fact hasAge(alice, 25);

pub derive adult(p) :- Person(p), hasAge(p, n), n >= 18;
"#;
// ...load, query, assert. No mutation calls. No test helpers.
}

Open questions

  • How are pub fact declarations sorted in the events section? Per the canonical-event-key sort, events sort by (tenant, fork, standpoint, kind, vt_start, tx_from). Facts get tx_from = 0 (declared at the workspace’s origin time)? Or do they get a deterministic tx_from derived from source position? The latter preserves source order; defer to implementation.

  • Can pub fact reference future-declared symbols? pub fact Person(alice); followed by pub kind Person; — does this work? Symmetric to how derive can reference forward-declared concepts. Default: yes; the elaborator does a single AST walk and resolves all references. Concrete error case: pub fact UnknownConcept(alice); → OE0210.

  • pub fact inside mod blockspub mod sub { pub fact ... } admits per-submodule fact scoping. Symbol resolution follows module visibility rules (per §3.1). No special handling beyond what’s already established.

  • Bulk-load tooling — for million-row fact sets, a CLI/tooling story is needed. Out of scope for the language surface; addressable post-MVP.

  • Interaction with standpoint scoping — when standpoints are surfaced, a fact in standpoint X is only visible in X (and its supertypes per FDE info-join). Defer to the standpoint substrate work track.

  • Negative facts / classical negationpub not_fact Person(alice) to assert that alice is not a Person. Useful in OWA / classical contexts; relates to §12.2’s Truth4 bilattice. Defer; orthogonal to this RFD’s positive-only scope.