Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

RFD 0020 — The runtime data engine: a composable query + reasoning pipeline

  • State: accepted — partially implemented (Phase 1 #98 landed; Phase 2 tracking issue #100 open)
  • Opened: 2026-06-06
  • Decides: the engine architecture of Argon’s runtime — the composable LogicalPlan → optimizer → PhysicalPlan → tiered execution pipeline that makes the runtime a full-blown, highly-optimized graph/knowledge database which (a) serves arbitrary ad-hoc queries and mutations (engine-configured), (b) maintains derived state incrementally as a reasoner, and (c) is the substrate the compiler/type-checker draws on — one composable IR, several physical backends, without becoming three incompatible engines or one monolith that compromises each role. This is the umbrella that frames RFD 0003 (the tier-dispatch seam), RFD 0018 (the recursive-tier read executor — the DBSP engine), and RFD 0019 (the write path), positioning each within the whole.

This RFD is the architecture decision record for the runtime engine. It is Lean-first where it touches reasoning semantics — the IR’s meaning conforms to spec/lean/Argon/Reasoning/ and the D1 pair-encoding correspondence (Standpoint/PairEncoding.lean); divergence there is a bug. But most of this RFD — the pipeline structure, physical operators, optimizer, storage layout — is engine architecture and ergonomics, which the Lean does not mechanize (per AGENTS.md scope) and which is settled from first principles here. It commits a plan, not code; orca-era decisions (D-NN) are cited only as corroborating prior experience, never as authority.


Question

Argon’s runtime is not “a reasoner with a storage backend.” It is a graph/knowledge database whose distinguishing feature is that the data system and the inference engine are the same system (oxc-reasoning/src/lib.rs:4-11: “Argon’s reasoner IS the data system… queries are sinks… the storage layer IS the reasoner’s state”). It must simultaneously be:

  1. A world-class graph database — accepting arbitrary ad-hoc queries and mutations (when the engine is configured to allow them), with traversals, pattern matching, aggregation, and recursion, optimized to compete with purpose-built graph stores at scale;
  2. An incremental reasoner — maintaining derived predicates (rules, bilattice/AFT, stratified NAF, aggregates; later WFS/defeasibility/MTL) over the same facts, incrementally on mutation (RFD 0018);
  3. The compiler/type-checker’s substrate — subtyping, refinement (where/iff), occurrence typing, and structural checks are queries over the type/ontology graph.

What is the engine architecture that serves all three without forcing the wrong shape on any of them? Concretely: what is the shared IR, what are the physical execution substrate(s), how do queries / rules / checker-goals / mutations relate, how is it optimized, how is data laid out at scale, and how do RFD 0003/0018/0019 compose inside it?


Context

Current state (verified against origin/main @ e7711a4)

  • The composable pipeline is designed, not wired. oxc-reasoning already declares the DataFusion-shaped stack: LogicalPlan (logical/mod.rs: Scan/Filter/Map/Join/AntiJoin/Distinct/Recurse/Sink), an OptimizerRule trait + an empty pipeline (optimizer/mod.rs), PhysicalPlan = Vec<Stratum> (physical/mod.rs), and an Engine dispatching Box<dyn TierExecutor> per stratum by tier (lib.rs:97-119). The module docs state the intent explicitly: “mirrors DataFusion’s ExecutionPlan pattern” and (compile/mod.rs) LogicalPlan is “the future surface for optimizer rules… when the optimizer materializes it will rewrite LogicalPlan → optimized → CompiledRule.”
  • The MVP took a shortcut around it. Today query_derive (oxc-runtime/src/lib.rs) goes AtomIR → CompiledRule → evaluate_to_fixpoint(&[CompiledRule], …) directly (6 call sites), bypassing LogicalPlan/optimizer/PhysicalPlan/TierExecutor. SemiNaiveExecutor::execute is an empty stub; the real (semi-naive) evaluator lives as free functions in executor/eval.rs. The whole RFD-0003 dispatch layer is currently orphaned — by staging, not by design error.
  • No ad-hoc query IR. Queries today are declared rules; there is no query expression/IR distinct from rules, and no general ad-hoc surface.
  • The checker does not yet use the reasoner. oxc-check is pure syntax-driven; the only shared artifact is the Tier enum. Role (3) is a goal, not wired fact.
  • Storage is naive. Relations are BTreeMap<CBOR-tuple, i64-weight> (runtime/relation.rs), decode-per-tuple in the join loop; no index-free adjacency / CSR / columnar / arrangements (the arrangement_body slot in .oxbin is reserved + inert).

What the roles demand (and what the references teach)

  • Graph-DB rolekuzu is the playbook: a strict bind → logical-plan → optimizer (visitor passes) → physical-mapper (1:N) → vectorized processor pipeline; factorization (flat vs unflat column groups → compact storage for many-to-many path patterns); multiway INTERSECT (worst-case-optimal joins) for cyclic patterns; CSR adjacency + columnar storage with semi-mask / predicate pushdown; clean extension hooks for pluggable operators. oxigraph adds index-permutation storage (SPO/POS/OSP), lazy iterator (volcano) evaluation, and the “a query is just a rule with a distinguished head” identity.
  • Reasoner rolekora teaches staged filtering (told-subsumers → EL-saturation → DL-saturation → tableau → FOL-escalation) and “absorption is a logical→physical lowering”; nous teaches the anti-patterns to avoid: rules-as-Rust-enums (rules must be data/IR), phases hardcoded into a loop (phases must be stratification metadata), and direct state mutation (rules must produce Z-set deltas, not mutate).
  • The physical insight — the vault’s Materialization Wall: for TC/1000, the bit-parallel computation is ~5 ms but converting the answer into joinable tuples is ~1100 ms (99.5%). The optimal compute representation (bitmatrix, CSR) differs from the optimal join representation (sorted tuples); converting between them dominates. The fix is BYODS — make the evaluator polymorphic over relation representations so the compute rep is the relation. This generalizes RFD 0018’s “arrangements (D7)” into a principle.
  • The IR principle — a reasoning logical IR is declarative, set-oriented, monotone, fixpoint-oriented (not SSA); FlowLog’s “explicit relational IR per rule, recursive control separated from the logical plan” is the right shape.
  • The optimizer trajectory — DataFusion’s own path: rule-based passes first; Cascades (memo + transformation/implementation rules + cost model; CMU’s optd is a Rust Cascades for DataFusion) when the search space justifies it.

Decision

A composable, multi-tier query+reasoning engine. Twelve decisions:

D1 — The runtime is a graph/knowledge database; reasoning is a capability within it

The product is a database: a durable, queryable, mutable store of individuals and relation-edges, with reasoning (derived predicates) as a first-class capability over the same data — not a bolt-on. Every other decision serves “world-class graph database that also reasons,” not “reasoner that also stores.” This reframes RFD 0018: the incremental reasoner is the engine’s view-maintenance subsystem, one tier among several.

D2 — One composable logical IR; three front-ends lower into it

LogicalPlan (a relational + graph + recursive algebra, D5) is the single shared surface. An ad-hoc query, a declared rule, and (at the boundary) a compiler/type-checker goal all lower to the same LogicalPlan: a rule adds a Recurse (fixpoint) node; a query adds a Sink/projection; a checker-goal is a bounded query over compile-time relations; a mutation is a write node (D11). This is why one engine can “serve all three” — they share the IR, like Substrait/DataFusion’s LogicalPlan is backend-agnostic. CompiledRule is reclassified as the physical lowering of a rule body (the mapper’s output), with LogicalPlan the optimizable form.

D3 — Two physical substrates, one IR (from first principles)

The IR is shared; the physical execution substrate is not one engine. Compile-time checking runs on Salsa-tracked memoized functions (demand-driven, per-definition invalidation, the rustc/rust-analyzer regime). Runtime queries/reasoning run on the DBSP/Z-set IVM engine (event-stream-driven, per-(tenant, fork) invalidation, RFD 0018). They meet at a boundary — compile artifacts feed the runtime; provenance composes — not in one executor. First-principles justification (not deference to orca’s D-09/D-10): the two halves have fundamentally different change regimes (source edits vs fact mutations), granularity (per-definition vs per-tuple), and lifetime (a build session vs a persistent multi-tenant store). A single physical substrate would force batch-IVM semantics onto fine-grained incremental type-checking, or Salsa’s recompute-on-demand onto a streaming fact firehose — compromising one half. The shared logical IR + tier ladder + provenance model is what unifies them; the physical backends are chosen per role.

D4 — The pipeline stages, strictly separated

front-end (parse → bind → type) → LogicalPlan → optimizer → physical mapper (1:N) → executor. Strict stage boundaries with typed IRs between them (kuzu’s discipline; and the vault’s canonical-pipeline-architecture post-mortem found that conflating analysis stages was the direct cause of a 4× diagnostic-count divergence — separation is a correctness property, not just hygiene). Each stage is independently testable; extension hooks (planner/mapper) allow pluggable operators and backends without editing the core.

D5 — The logical operator algebra

LogicalPlan extends the present relational core with graph-native and mutation nodes:

  • Relational: Scan, Filter, Map/Project, Join, AntiJoin (stratified NAF), Distinct, Aggregate, Union, Sink.
  • Graph-native: Extend (single-hop edge traversal), PathExtend (variable-length / recursive path), Intersect (multiway / worst-case-optimal join for cyclic patterns).
  • Recursive: Recurse (least-fixpoint — the rule-evaluation operator; “rules are the execution unit” lives here).
  • Mutation: Insert, Delete, Set, Merge (D11; the write path’s logical surface, coordinating with RFD 0019).

Graph-native nodes desugar to joins + Recurse for correctness, but the optimizer can lower them to native physical traversal operators (index-free adjacency, factorization) when the storage rep supports it — the kuzu performance win. The IR stays declarative / set-oriented / monotone / fixpoint-oriented.

D6 — BYODS: physical relations are polymorphic over representation

A physical Relation is an interface (membership, key-scan, range-scan, join-key iteration), not a fixed BTreeMap<CBOR, weight>. Representations coexist behind it: sorted arrangement on InternalId (RFD 0018 D7, the join workhorse), bitmatrix (transitive closure — kora-reason-rl’s bit-parallel Warshall), CSR index-free adjacency (graph traversal), factorized (D7), and virtual/lazy (generate tuples on demand). The optimizer/mapper picks the rep per relation; the engine never pays the Materialization Wall — the optimal compute rep is the relation, served directly to downstream operators. This subsumes and generalizes RFD 0018’s “arrangements.”

D7 — Factorization is in scope for v1

kuzu-style factorized query processing — flat vs unflat column groups, FLATTEN operators inserted by a rewriter pass, factorized intermediate results — is part of v1, not deferred. It is the difference between linear and Cartesian memory for the many-to-many graph patterns a knowledge graph is made of; deferring it would mean rebuilding the physical layer later. It is built early, alongside the Z-set/arrangement baseline, and reconciled with the Z-set model (a factorized Z-set is a compressed multiplicity-carrying batch).

D8 — The optimizer: rule-based visitor passes now, Cascades later

The optimizer is a chain of visitor-based rewrite passes (the OptimizerRule trait, made real): predicate/projection pushdown, join reordering (cardinality-guided), magic-sets / demand transformation (Datalog — restrict bottom-up rule evaluation to the query’s demand), stratum merging, factorization rewriting (D7), and backend/tier dispatch. This matches DataFusion’s shipping design. Cascades (optd-style memo + transformation/implementation rules + cost model over ontology/graph statistics) is the planned successor once the plan search space (graph join orders × representation choice × backend choice) outgrows hand-ordered passes — adopted then, not now.

D9 — Tier dispatch with pluggable executors (generalizes RFD 0003)

The 7-tier classifier (classifier/mod.rs) routes physical (sub)plans to executors, each owning its own incrementality over the shared relation catalog: Salsa-tracked functions (compile-time, structural/closure checking), semi-naive / DBSP (recursive runtime tier — RFD 0018), SLG (WFS-over-cycles — deferred), Kora (DL/expressive tier — embedded behind the seam, kora’s staged saturation/tableau), SMT (FOL under unsafe logic). Tier is compile-time metadata on rules/plans (not a runtime profile flag — the nous anti-pattern). This is RFD 0003’s TierExecutor seam, generalized from “reasoner backends” to “any physical-plan backend.”

D10 — Storage: CQRS, and the event-log-as-sole-store risk

The store splits write model (the append-only axiom_events log — source of truth, audit, bitemporal time-travel; the Z-set delta stream RFD 0018 consumes; the write path RFD 0019 produces) from read model (a persisted, indexed, graph-optimized materialization — CSR adjacency / columnar / factorized, the durable BYODS physical relations). The event log must not sit on any hot read path. A single append-only log is an excellent write/audit model but a catastrophic primary read store at scale (re-materialization is O(database); per-query log replay dies) — so the persisted read model is the primary served store, the log is checkpointed/compacted behind it, and cold-start reads the read model, not the log. (This is the storage-performance concern raised in discussion, made a hard contract.) Cross-refs RFD 0018 D4 (which this generalizes from the reasoner’s arrangements to the whole database’s read model) and the spec §20 CQRS catalog.

D11 — Ad-hoc queries and mutations are first-class (the database API), config-gated

The engine accepts arbitrary queries and mutations at runtime, not only declared rules/procedures — when the engine is configured to permit it (a deployment may restrict to declared operations for safety/perf). Ad-hoc queries are LogicalPlan trees built at runtime (D2/D5); ad-hoc mutations are write nodes producing Z-set deltas that the IVM engine maintains derived state against (D5/D10; semantics owned by RFD 0019). This is the read/write API surface of the database — gating is an engine policy, not a language restriction (tenancy/IAM stay in the serving layer per AGENTS.md).

D12 — Relationship to RFD 0003 / 0018 / 0019, and the build order

  • RFD 0003 (backend dispatch) is the TierExecutor seam — subsumed and generalized by D9.
  • RFD 0018 (DBSP engine) is the recursive-tier runtime read executor — the physical backend for Recurse/IVM. Correctly scoped; this RFD is the layer above it.
  • RFD 0019 (mutation write-path correctness) owns the write semantics; D5/D10/D11 host its logical surface and storage contract.
  • Build order: the “executor unification” work is no longer a standalone refactor — it is Phase 1 of this RFD: route the runtime through the Engine/dispatch over the proven rule path (with Recurse as the operator), make SemiNaiveExecutor::execute real, de-magic-number convergence, fix the stale headers. Then: Phase 2 — the LogicalPlan/optimizer/physical-mapper made real (the pipeline wired end-to-end) + BYODS reps; Phase 3 — graph-native operators + factorization + the persisted read model; Phase 4 — ad-hoc query/mutation surface + Cascades when justified. RFD 0018’s own phases (the DBSP engine) proceed in parallel as the recursive-tier executor.
    • As-built realization: RFD 0021. Phase 1 (dispatch through Engine::evaluate), Phase 2 (indexed + persistent-arrangement joins, WCOJ, BYODS/CSR reps, the SIP body-reorder optimizer), and the reasoner half of Phase 3 (graph-native joins + factorization) are now landed in oxc-reasoning; 0021 records that engine as built. The persisted read model + cross-query IVM (the remaining half of Phase 3) is still open — 0021 D6/D7 show why the operator discipline already in place makes it an additive layer rather than a rewrite.

Rationale

Why composable, not monolithic. Three roles with different consumers and change regimes cannot be served well by one hand-rolled evaluator (nous proved the failure mode: hardcoded phases, rules-as-enums, an EL++ ceiling). A composable IR + pluggable optimizer passes + pluggable executors is exactly how DataFusion serves dozens of embeddings and kuzu serves graph workloads — and it is what lets a new capability (a new operator, a new tier backend, a new physical rep) land without rewriting the engine.

Why one IR but two substrates. Unifying the logical layer is what makes the three roles coherent (one algebra, one tier ladder, one provenance model). Unifying the physical layer would be a category error: compile-time checking and runtime fact-streaming have different change granularity and lifetime; the right incremental machinery differs (Salsa vs DBSP). Share the meaning; specialize the mechanism.

Why BYODS / never materialize. The Materialization Wall is empirical (99.5% of TC time is representation conversion). A world-class graph database cannot pay that. Polymorphic relations let TC stay a bitmatrix, traversal stay CSR, joins stay sorted arrangements — each served through one interface, none converted.

Why factorization now. Knowledge graphs are many-to-many. Flat tuple materialization of path/pattern results is Cartesian; factorization is linear. It is structural to the physical layer, so it is cheaper to build in than to retrofit — hence v1.

Why CQRS with the log off the hot path. Event-sourcing gives audit, time-travel, and a clean IVM delta stream — but a log is a write model. Serving reads from it is O(database). The persisted, indexed read model is the only way to hit world-class read latency at scale; the log earns its keep on the write/audit side.

Why ad-hoc, gated. A database that only runs pre-declared procedures is a stored-procedure engine, not a database. Ad-hoc queries/mutations are the product; gating is an operational policy for deployments that want it.


Alternatives considered

  • Keep the MVP shortcut (no pipeline); grow the direct rule evaluator. Rejected: it cannot host ad-hoc queries, graph-native operators, an optimizer, or pluggable backends without becoming the monolith nous warns against; it is the stepping-stone we’d replace.
  • One physical engine for both compile-time and runtime. Rejected (D3 rationale): forces the wrong incremental regime on one half.
  • Relational-only IR; treat graph queries as sugar over joins. Rejected for v1’s graph-DB ambition: loses the native-traversal / factorization / WCO-join performance that defines a graph database (D5/D7).
  • Materialize everything into sorted tuples (no BYODS). Rejected: the Materialization Wall (D6).
  • Event-log as the primary read store (no persisted read model). Rejected: O(database) reads; dies at scale (D10) — the explicit storage concern.
  • Declared-queries-only (no ad-hoc). Rejected: not a database (D11).
  • Cascades optimizer from day one. Deferred, not rejected: rule-based passes are sufficient until the search space justifies a memo/cost-model engine (D8).

Consequences

  • Code structure. oxc-reasoning’s logical/optimizer/physical/executor modules become the real pipeline (not scaffolding); the runtime routes through Engine/dispatch instead of the bare free function; a physical Relation trait (BYODS) replaces the fixed BTreeMap<CBOR,weight>; new graph-native + mutation logical nodes; a factorized-batch physical layer; a persisted read-model store behind a StorageBackend seam.
  • Spec. This RFD frames RFD 0003/0018/0019; the reference (spec/reference/src/{17,19,20}) gains an engine-architecture chapter once the design lands. RFD 0018’s “the engine” framing is contextualized as the recursive-tier executor.
  • Lean / conformance. The IR’s reasoning semantics conform to spec/lean/Argon/Reasoning/ + the D1 pair-encoding correspondence; the pipeline structure / operators / optimizer / storage are engine architecture (outside the Lean’s mechanized scope per AGENTS.md) and are conformance-tested against the semi-naive oracle + the Lean (RFD 0018 D8). The optimizer’s rewrites must be semantics-preserving — a differential-test obligation (each pass: optimized plan ≡ unoptimized plan on generated inputs).
  • Workflow. Branch off origin/main; per-PR CI-gated; the executor-unification slice (Phase 1) is the first PR; no “done” without running/benchmarked proof; commit at checkpoints.
  • Performance posture. The scaling contract (RFD 0018 D4) is hereby a database contract, not just a reasoner one: no O(database) operation on any hot path, at any scale, for queries or mutations or reasoning.

Open questions / tracked-future

  • Event-log compaction / checkpointing design — how the read model is kept primary and the log is compacted/archived off the hot path (the storage-performance concern); interacts with RFD 0018 D4 + the spec §20 schema. A research item before Phase 3.
  • Factorization ↔ Z-set interplay — the precise representation of a factorized, multiplicity-carrying, possibly bilattice-pair-encoded batch (D7 × RFD 0018 D1/D5). Needs a concrete data-model design before Phase 3.
  • The checker-uses-the-reasoner boundary — when/how oxc-check starts issuing LogicalPlan goals (subtyping / refinement / occurrence) executed on the Salsa substrate; what the shared-IR contract between compile-time and runtime looks like in code. Role (3) is design-for-now; the seam is D2/D3, the wiring is later.
  • Cost model / statistics source — what cardinality/selectivity statistics the optimizer (and eventually Cascades) consumes, and how they are maintained incrementally over a mutating graph.
  • Ad-hoc safety / config-gating model — the engine-configuration surface for permitting/restricting ad-hoc queries and mutations (resource limits, allowed operators, tier ceilings); a Phase-4 design coordinated with the serving layer (RFD 0014).
  • WCO-join scope — how far to take worst-case-optimal / multiway joins (kuzu’s Intersect is star-pattern-restricted; general WCO is more) for cyclic graph patterns.