RFD 0033 — The ad-hoc query and mutation surface
- State: accepted — implemented
- Opened: 2026-06-14
- Decides: that arbitrary, not-pre-declared (ad-hoc) queries and mutations are a
first-class, default-on capability of the Argon runtime — submitted as source text at request
time, parsed, lowered, and executed against the loaded module — with a deployment opt-out that
restricts a server to declared invocables only. Establishes that the generic submission path is
the substrate, and the declared pub query / pub mutate forms are a thin named wrapper over it, not the only door. Builds on RFD 0014 (the serving surface), RFD 0015 (the mutate body / Operation IR), RFD 0020 / RFD 0021 (the Engine / CompiledRule evaluation seam), and RFD 0022 (the build evaluability gate, whose runtime analogue this RFD must define).
- Implements (as built): the query-provider Schema interface (oxc_types::Schema) with two parity-gated backends, WorkspaceSchema (build-time, over ASTs) and oxc_runtime::ModuleSchema (runtime, over a loaded .oxbin); the checker (oxc-check) fully routed through it; the runtime frontend (oxc-parser / oxc-check / oxc-instantiate now linked into oxc-runtime); Store::eval_{query,mutation}_source (parse → full type-check → lower → run, ill-typed bodies refused and never run); the POST /v1/{query,mutation}/adhoc HTTP surface + ox query --eval CLI; the AdhocPolicy opt-out (default-on); and the build-vs-runtime agreement gate (oxc-runtime/tests/adhoc_agreement.rs) asserting byte-identical diagnostics + lowered IR. The persisted-projection-cache / IVM materialization arc is a separate follow-on, out of this RFD’s scope.
Question
A data system you cannot query ad hoc is not a database. Argon’s design intent — stated repeatedly
and recorded since 2026-05-29 — is that the runtime accepts arbitrary queries and mutations at
request time, not only the “stored-procedure” pub query / pub mutate declarations that lower
into .oxbin. The declared forms are meant to be a convenience layer over a generic ad-hoc path.
A deployment may turn ad-hoc off (lock down to declared-only) for safety, but that is a gate you
enable, not a default-closed wall.
Today that path is unbuilt at the edges, and — separately — the project’s own notes and one prior analysis have repeatedly mis-described it as “rejected by design.” It is not. This RFD settles:
- What the ad-hoc surface is (wire shape, CLI shape, semantics), for queries and mutations together.
- How a body submitted as source text is compiled at runtime, given that the compiler frontend is not currently linked into the serving binary.
- The resolution context: how names and types in an ad-hoc body resolve against the loaded module rather than a build-time Salsa workspace.
- How much type-checking an ad-hoc body receives (answer: the full amount), and what decidability-tier admittance applies at runtime (answer: build-gate parity by default, configurable).
- The security model: default-on, the deployment-level opt-out, and affordance parity — ad-hoc is governed by the same uniform capability scoping as declared invocation, never an ad-hoc-specific leash.
Context
The framing matters because it has been wrong. The corrected, code-verified picture:
The reasoner is rule-as-data, and the compile step already runs at request time. In
Store::query_body_dispatch (compiler/crates/oxc-runtime/src/lib.rs:6799–6824) the runtime
decodes a query’s AtomIR body + head Term, calls
oxc_reasoning::compile::compile_rule(short, &head, &atoms) at dispatch time, pushes the fresh
CompiledRule onto the module’s rules, and runs Engine::evaluate(&rules, &mut catalog, …). The
engine consumes &[CompiledRule] as plain data; it has no notion of “pre-declared.” The only thing
tying this to a declaration is the source of atoms/head: find_query_decls(name) looks them up
from .oxbin-loaded QueryDeclBodys rather than from the request.
The mutation interpreter is already general and decl-agnostic. Store::run_body_op
(oxc-runtime/src/lib.rs:3548+) interprets an Operation sequence (InsertIof, InsertTuple,
Update, For, If, Return, … — oxc-protocol/src/core_ir.rs:389) and does not take the
MutationDecl; the decl is consulted only for argument validation and capability checks at the
boundary. The storage write methods (emit_iof_assertion, emit_relation_tuple,
emit_individual_property_assertion) are origin-agnostic.
So the constraint is not semantic. It is three concrete wiring gaps:
- No request field for a body. DispatchDescriptor is { qualified_path, args, return_type } (oxc-serve/src/lib.rs:2671); resolution is query_decls.get(qualified_path) → 404 if absent. There is nowhere to put a body. (The runtime refusal ARGON_RUNTIME_UNSUPPORTED_QUERY_BODY at oxc-serve/src/lib.rs:6749 is a narrower executor gap, not an ad-hoc policy: field projections in bodies are not yet executable.)
- The compiler frontend is not linked into the serving binary. oxc-serve and oxc-runtime depend on oxc-reasoning, oxc-protocol, oxc-oxbin (+ storage), and not on oxc-parser, oxc-check, oxc-instantiate, oxc-resolver, or oxc-db (verified in both Cargo.tomls). The runtime can compile pre-lowered IR but cannot turn source text into IR.
- Name/type resolution is build-time. The frontend’s full type-checker (oxc-check) is bound to a Salsa OxcDb / Workspace / resolver.
The fourth fact reshapes the whole design and is why this is tractable:
Lowering is already decoupled from Salsa. oxc_parser::parse(source_text: &str) -> Parse
(oxc-parser/src/lib.rs:44) is standalone — string in, parse tree out, no DB. Rule-body lowering is
body_to_atoms_ctx(list: &SyntaxNode, ctx: &LowerCtx) -> Vec<AtomIR> (atom_lower.rs:49), and
LowerCtx (expr_lower.rs:116) resolves names through plain closures —
resolve_type: &dyn Fn(&str) -> Option<NameRef>, plus enum-variant and field-optionality resolvers —
not Salsa. In oxc-instantiate/src/lower.rs, every &dyn OxcDb use is parse_file(db, file): the
DB’s only job in the lowering data path is to produce the parse tree.
The genuinely Salsa-heavy component is oxc-check (reference resolution + full type inference via
resolve_path(db, workspace, file, …) and lower_type_expr(…)), and it runs separately, after
lowering. So “decouple frontend lowering from Salsa” splits into two very different tasks:
- (a) Lowering is already call-site-decoupled. The work is to build a LowerCtx whose closures are backed by the runtime Module / .oxbin catalog instead of the build-time file pre-pass. Small.
- (b) Type-checking is Salsa-bound. Reproducing it at runtime (or deciding ad-hoc bodies get lighter validation) is the large, separable decision.
What the runtime already exposes for (a): Module (oxc-runtime/src/lib.rs) carries concept_id,
concept_id_by_short_name, relation_id, ancestor_concept_ids_including_self,
resolve_predicate_key, resolve_rule_name, resolve_mutation_invocable, symbol_path, and (today
private) declared_field. The .oxbin declaration bodies (oxc-protocol/src/storage.rs) carry
field declarations with type expressions, refinement predicates, relation arg concepts/cardinalities,
and query/mutation parameter types — encoded as CBOR. The information needed to back the LowerCtx
closures exists; Module simply doesn’t yet expose a resolution surface over it (notably:
resolving names inside a CBOR-encoded TypeExpr, field-type lookup, and a parameter catalog).
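Task (a) can be illustrated with a minimal sketch. It assumes heavily simplified stand-ins: NameRef, LowerCtx, and Module here are toy shapes, and lower_type_name and with_module_ctx are hypothetical helpers, not the oxc-instantiate or oxc-runtime API. The point it shows is that lowering only ever calls a closure, so swapping the build-time pre-pass for a catalog-backed resolver does not touch the lowering code.

```rust
use std::collections::HashMap;

/// Stand-in for the resolved-name type (hypothetical shape).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct NameRef(pub u32);

/// Stand-in for the lowering context: names resolve through a plain closure.
pub struct LowerCtx<'a> {
    pub resolve_type: &'a dyn Fn(&str) -> Option<NameRef>,
}

/// Stand-in for the loaded module's catalog index.
pub struct Module {
    pub concept_id_by_short_name: HashMap<String, u32>,
}

/// Lowering only sees the closure, so it cannot tell which backend answers.
pub fn lower_type_name(ctx: &LowerCtx<'_>, name: &str) -> Result<NameRef, String> {
    (ctx.resolve_type)(name).ok_or_else(|| format!("unresolved type `{name}`"))
}

/// Run `f` with a context whose resolver reads the module catalog.
pub fn with_module_ctx<R>(module: &Module, f: impl FnOnce(&LowerCtx<'_>) -> R) -> R {
    let resolve = |name: &str| {
        module
            .concept_id_by_short_name
            .get(name)
            .copied()
            .map(NameRef)
    };
    let ctx = LowerCtx { resolve_type: &resolve };
    f(&ctx)
}
```

A caller would wrap each lowering call as with_module_ctx(&module, |ctx| lower_type_name(ctx, "Person")); an unknown name surfaces as an ordinary resolution error rather than a panic.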
Decision
Adopt a two-tier surface, with the generic path as substrate and declared decls as a wrapper.
Tier A — the generic submission substrate
A submitted body flows through the same runtime seam declared invocables already use:
- Queries: (head Term, Vec<AtomIR>) → compile_rule → appended to module rules → Engine::evaluate → rows. (This is literally the query_body_dispatch path with the IR sourced from the request instead of find_query_decls.)
- Mutations: Vec<Operation> (+ params) → the existing run_body_op interpreter, under the same atomicity, read-your-writes, and delta-guard contract as declared mutations (RFD 0015 / RFD 0019).
Tier A is reachable in two framings, in priority order:
- Source text (the product surface): the request carries an Argon query/mutation body string. The runtime parses and lowers it (Tier B) to the IR above, then runs it. This is what ox query '<body>', a REPL, and an /v1/query HTTP endpoint use.
- Pre-lowered IR (the substrate boundary): the IR itself is the unit Tier A executes. It is the internal contract the source-text path compiles down to, and declared decls already produce it. Whether IR is also a public client surface is left open deliberately (§Open): it is a performance/optimization question (a precompiled/prepared-statement analogue), and the answer should be whatever is correct once the prepared-body / caching design is worked out, not a guess made here. Note that IR submitted directly would bypass the type-checker, so if exposed it must carry its own validation story, which is another reason to settle it with the performance design rather than now.
Declared pub query/pub mutate become wrappers: their dispatch resolves a name to stored IR
and then enters the same Tier A execution. No second engine path.
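The "no second engine path" shape can be sketched as follows. This is an illustration under simplified assumptions: QueryIr, Source, and this Store are hypothetical stand-ins (the real unit is a head Term plus Vec<AtomIR>, and the real executor is the engine), and the strings only make the control flow visible.

```rust
use std::collections::HashMap;

/// Stand-in for lowered IR (hypothetical; strings instead of Term / AtomIR).
#[derive(Clone, Debug, PartialEq)]
pub struct QueryIr {
    pub head: String,
    pub atoms: Vec<String>,
}

/// Where the IR for a request comes from.
pub enum Source<'a> {
    /// Declared invocable: look the IR up by qualified path.
    Declared(&'a str),
    /// Ad-hoc: IR already lowered from the request's source text.
    Adhoc(QueryIr),
}

pub struct Store {
    pub query_decls: HashMap<String, QueryIr>,
}

impl Store {
    /// The single execution path. Here it only reports what it would evaluate.
    fn execute(&self, ir: &QueryIr) -> String {
        format!("eval {} :- {}", ir.head, ir.atoms.join(", "))
    }

    /// Both entry points resolve to IR and then call `execute`.
    pub fn dispatch(&self, source: Source<'_>) -> Result<String, String> {
        let ir = match source {
            Source::Declared(path) => self
                .query_decls
                .get(path)
                .cloned()
                .ok_or_else(|| format!("404: no declared query `{path}`"))?,
            Source::Adhoc(ir) => ir,
        };
        Ok(self.execute(&ir))
    }
}
```

In this sketch a declared name and an ad-hoc body with the same IR produce the same result, because the only branch is where the IR is sourced from.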
Tier B — runtime parse + lower + check (the resolution contract)
Parsing is the easy part: oxc_parser::parse(source_text: &str) -> Parse is already standalone (no
DB). The hard part — resolving and type-checking the body against the loaded schema — is solved by a
single proven pattern, not by carrying source and not by a second checker.
The query-provider pattern (decision #3, refined 2026-06-15). Across every mature
separately-compiled language — rustc (.rmeta as a query provider: tcx.type_of(def_id) is
answered from local HIR or by decoding metadata, dispatched only by local-vs-extern), Go
(go/types’ Importer), OCaml/GHC/SML/Scala/F# (rehydrate serialized data into the same Env /
TyThing / StaticEnv / typed-tree the checker already consumes) — the dominant, unanimous design
is one type-checker whose environment access is an interface, answered either from source (local)
or from already-resolved serialized facts (imported / loaded). Nobody re-elaborates the
dependency’s source; nobody forks the checker. The PL-theory framing is the same (external prior art,
cited as ideas not authority): F-ing modules’ “signatures are views over the kernel’s type
structure, not a parallel type system,” and the .olean / .ttc interface-file precedent that the
serialized environment is the type-checking source-of-truth.
Concretely for Argon:
- Introduce a Schema interface: the narrow set of environment-access operations the frontend actually performs. These are: resolve a name to a declared concept/relation/struct/enum; a concept’s fields and their types; subsumption/parent edges; relation argument arities and types; enum variants; query/mutation parameter types; and each concept’s world assumption (CWA/OWA), so that three-valued OWA refinement checking can’t silently diverge (a substrate-research caveat).
- oxc-parser (standalone), oxc-instantiate body-lowering (already (&SyntaxNode, &LowerCtx), no DB), and oxc-check all resolve through Schema. The build-time backend answers from the Salsa workspace / ASTs (today’s code, behavior unchanged); the runtime backend answers from the loaded module. The inference and lowering logic is shared and untouched; only the environment-access surface is abstracted. This is the rustc local-vs-extern split, not a rewrite of the type system.
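The query-provider split can be sketched in a few lines. This is a toy reduction, not the oxc_types::Schema trait: the single field_type operation, the struct layouts, and check_field are all assumptions made for illustration. What it shows is one checker function that takes the interface, with a source-shaped backend and an index-shaped backend behind it.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily reduced environment-access interface.
pub trait Schema {
    /// Type name of `field` on `concept`, if both are declared.
    fn field_type(&self, concept: &str, field: &str) -> Option<String>;
}

/// Build-time stand-in: answers from a toy model of source declarations.
pub struct WorkspaceSchema {
    pub decls: Vec<(String, Vec<(String, String)>)>,
}

/// Runtime stand-in: answers from an index built when the module loads.
pub struct ModuleSchema {
    pub fields: HashMap<(String, String), String>,
}

impl Schema for WorkspaceSchema {
    fn field_type(&self, concept: &str, field: &str) -> Option<String> {
        self.decls
            .iter()
            .find(|(c, _)| c.as_str() == concept)
            .and_then(|(_, fs)| fs.iter().find(|(f, _)| f.as_str() == field))
            .map(|(_, ty)| ty.clone())
    }
}

impl Schema for ModuleSchema {
    fn field_type(&self, concept: &str, field: &str) -> Option<String> {
        self.fields
            .get(&(concept.to_string(), field.to_string()))
            .cloned()
    }
}

/// The one checker: it sees only the interface, never a concrete backend.
pub fn check_field(
    schema: &dyn Schema,
    concept: &str,
    field: &str,
    expected: &str,
) -> Vec<String> {
    match schema.field_type(concept, field) {
        None => vec![format!("unknown field `{concept}.{field}`")],
        Some(ty) if ty != expected => {
            vec![format!("`{concept}.{field}` has type {ty}, expected {expected}")]
        }
        Some(_) => Vec::new(),
    }
}
```

Because check_field has no access to either backend's internals, any difference in its output for the same input has to come from the backends disagreeing about the facts, which is the property the parity discipline relies on.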
The runtime backend reads a projection over the event log — it serializes nothing new. This
follows from how Argon storage works today (verified in current code, not assumed): storage is a
single append-only axiom_events log (oxc-protocol’s AxiomEvent; the axiom_events table in
oxc-storage-pg), and Module already builds its concept/relation/field indexes from that event
stream at load. So the runtime Schema is a reader over the catalog projection the Module
already builds from declaration events — not an embedded copy of source and not a separate
schema section. The data it needs (resolved field TypeExprs, parent ids, relation arg types,
params, refinement predicates) is already in the .oxbin decl bodies. We do not add a redundant
representation of facts the log already holds; we expose them through the interface.
The artifact-identity and drift-guard machinery already exists; the Schema backend keys on it.
The separate-compilation literature is unanimous that cross-boundary type identity must be a
persistent content hash (rustc DefPathHash + StableCrateId; SML content-derived PIDs), never a
structural match or an allocation-order stamp. Argon’s .oxbin already implements this: per-section
BLAKE3 content hashes and a composition signature (oxc-oxbin/src/composition_signature.rs,
content_hash.rs, section.rs), a multi-axis version preamble with strict-producer/liberal-consumer
gating checked at the load site before any body section (versioning.rs; reader.rs), and a
load-time tier gate (validation.rs layer1_valid → OE1204). So the boundary is already guarded
two ways — a hard version/format header (deterministic refusal of an incompatible artifact) plus
content fingerprints over the sections. The runtime Schema backend identifies its schema by the
loaded module’s composition signature and section hashes; nothing new is invented here.
The genuine residual is narrower than “no identity”: artifact-level identity is solid, but it is not
yet threaded to per-event / per-symbol identity inside the store — the storage-side gap where
module_id is effectively constant, so two schemas’ symbol ids can collide at the event level (a
known storage defect). Schema resolution must carry module/artifact identity down to per-symbol
resolution; fixing that is shared with the storage-identity work, not additive to it.
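One way to picture the residual is a symbol key that cannot be constructed without its artifact identity. The sketch below is an assumption-laden illustration: ArtifactId, SymbolKey, and SymbolTable are hypothetical names, and the 4-byte id is a shortened stand-in for a real content-derived signature.

```rust
use std::collections::HashMap;

/// Content-derived artifact identity (shortened stand-in for a real signature).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ArtifactId(pub [u8; 4]);

/// A symbol id that is only meaningful together with its artifact.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SymbolKey {
    pub artifact: ArtifactId,
    pub local: u32,
}

/// Per-symbol table keyed by (artifact, local id), never by local id alone.
pub struct SymbolTable {
    names: HashMap<SymbolKey, String>,
}

impl SymbolTable {
    pub fn new() -> Self {
        Self { names: HashMap::new() }
    }

    /// Record a symbol; returns the previous name if the key was already used.
    pub fn insert(&mut self, key: SymbolKey, name: &str) -> Option<String> {
        self.names.insert(key, name.to_string())
    }

    pub fn name(&self, key: SymbolKey) -> Option<&str> {
        self.names.get(&key).map(String::as_str)
    }
}
```

With this shape, two schemas that both allocate local id 1 occupy different keys, so neither overwrites the other.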
Type-checking: full, no shortcuts (decision #1)
An ad-hoc body receives the same, complete type-checking a declared body gets — name resolution,
reference checking, full inference — via the same oxc-check logic, now resolving through Schema.
There is no “lighter validation” tier and no unchecked-but-executed path; a half-checked ad-hoc
surface would be exactly the hollow feature the house rules forbid.
Parity is enforced as a canonical-input contract + agreement test — the same discipline the repo
already runs at the Lean↔Rust boundary (the @[language_interface] drift test in oxc-protocol,
where one logical contract is checked across two representations). Schema is the only way the
frontend may touch the environment — no caller reaches around it to the AST or the catalog
(make-illegal-states-unrepresentable) — and a CI agreement test asserts that the same body checked
against the build-time and runtime Schema backends yields byte-identical diagnostics and
identical lowered IR. Drift is a defect, gated like any spec↔code drift.
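The agreement test described above has a simple core, sketched here under stated assumptions: Checked and first_disagreement are hypothetical names, and the two frontends are passed in as plain functions rather than the real build-time and runtime pipelines. The actual gate lives in oxc-runtime/tests/adhoc_agreement.rs and is not reproduced here.

```rust
/// Output of running the frontend on one body (hypothetical shape).
#[derive(Debug, PartialEq)]
pub struct Checked {
    pub diagnostics: Vec<String>,
    pub lowered_ir: Vec<u8>,
}

/// Return the first body on which the two backends disagree, if any.
/// Equality covers both the diagnostics and the lowered IR bytes.
pub fn first_disagreement<'a>(
    bodies: &[&'a str],
    build: impl Fn(&str) -> Checked,
    runtime: impl Fn(&str) -> Checked,
) -> Option<&'a str> {
    bodies
        .iter()
        .copied()
        .find(|body| build(*body) != runtime(*body))
}
```

A CI test built on this would assert that first_disagreement returns None over a corpus of bodies, and would report the offending body when it does not.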
Decidability-tier admittance (decision #2)
By default a deployment with ad-hoc enabled admits the same tier ceiling as the build evaluability gate
(RFD 0022) — ad-hoc bodies are held to the identical
decidability bar as declared ones. The load-time tier gate that enforces parity already exists
(oxc-oxbin/src/validation.rs layer1_valid, refusing max_tier_claimed beyond the runtime’s
capability with OE1204); an ad-hoc body’s classified tier is checked against the same ceiling. The
ceiling is intended to be configurable per deployment (a server may set a lower ad-hoc ceiling for
untrusted callers) — that per-call/lenient mode is not yet built (today’s gate is artifact-level
strict) — but the default is parity, and a deployment may not silently admit more than the build
gate would.
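The admission rule can be sketched as a small value type. The tier names T0..T2, the AdhocCeiling type, and its error strings are hypothetical; the source does not specify them. The sketch encodes two rules from the text: the default is parity with the build gate, and a deployment can lower the ad-hoc ceiling but cannot raise it past the build ceiling.

```rust
/// Hypothetical decidability tiers, ordered from least to most expressive.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    T0,
    T1,
    T2,
}

/// Ad-hoc admission ceiling, never above the build gate's ceiling.
#[derive(Debug, PartialEq)]
pub struct AdhocCeiling {
    ceiling: Tier,
}

impl AdhocCeiling {
    /// Default: parity with the build evaluability gate.
    pub fn parity(build_ceiling: Tier) -> Self {
        Self { ceiling: build_ceiling }
    }

    /// A deployment may lower the ceiling. Asking for more than the build
    /// gate admits is an error rather than a silent widening.
    pub fn lowered(build_ceiling: Tier, requested: Tier) -> Result<Self, String> {
        if requested > build_ceiling {
            return Err(format!(
                "ad-hoc ceiling {requested:?} exceeds build ceiling {build_ceiling:?}"
            ));
        }
        Ok(Self { ceiling: requested })
    }

    /// Admit a body only if its classified tier is within the ceiling.
    pub fn admit(&self, body_tier: Tier) -> Result<(), String> {
        if body_tier <= self.ceiling {
            Ok(())
        } else {
            Err(format!(
                "body tier {body_tier:?} exceeds ceiling {:?}",
                self.ceiling
            ))
        }
    }
}
```

Making the ceiling a constructed value means the "may not admit more than the build gate" rule is checked once at configuration time instead of on every request.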
Security: affordance parity, deployment-level control only
The governing principle (decision #4): ad-hoc queries and mutations have the same affordances as everything else. Ad-hoc is not a hobbled subset of the declared surface — it is the surface, with declared forms as the named convenience layer over it. We do not special-case what ad-hoc may express, read, or write relative to a declared invocable. The Postgres test applies: a system you cannot freely query and mutate is not a useful system.
Control is therefore deployment-level, applied uniformly, never an ad-hoc-specific leash:
- Ad-hoc submission is on by default. A deployment may opt out to restrict to declared invocables only (lock-down), or run read-only (a normal database posture, not an ad-hoc penalty) — these are the same kinds of switches any database exposes.
- Whatever capability / RBAC / tenant / fork / standpoint scoping exists applies equally to declared and ad-hoc invocation. An ad-hoc mutation that a caller’s capabilities permit is exactly as permitted as the equivalent declared mutation.
- One capability exception: forget. Physical erasure (forget) is gated on the build-time #[allow_forget] grant, which is a declaration-site capability. A runtime-submitted body has no declaration site and so cannot confer it on itself; an ad-hoc forget is therefore refused (OE0730). This is not an ad-hoc-specific leash on affordance; it is that a request cannot forge a build-time capability grant (the same reason an ad-hoc body cannot, say, mark itself #[brave]). A declared #[allow_forget] mutate still erases; an ad-hoc body cannot. We record this as the deliberate exception to the otherwise-unqualified parity rather than pretend parity is total (originally this section asserted no Forget gate at all; that was the bug, not the code).
- This is orthogonal to the generic-entity-write denial (POST /v1/entities → 404, oxc-serve/src/lib.rs:9565): that is an untyped-blob REST shape, a different axis. Ad-hoc writes go through the typed mutate / Operation mechanism with the full mutate affordance set. The two must not be conflated.
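The two controls in this section (the deployment opt-out and the forget exception) can be sketched as one admission function. The enum variants, Origin, Op, Refusal, and admit are hypothetical; only the AdhocPolicy name, the default-on posture, and the OE0730 code for the ad-hoc case come from the text. Capability / RBAC scoping is deliberately not modelled, since it applies identically to both origins.

```rust
/// Deployment-level switch (variant names are assumptions). Default-on.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub enum AdhocPolicy {
    #[default]
    Enabled,
    DeclaredOnly,
}

/// Where a body came from. Only a declaration site can carry the forget grant.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Origin {
    Declared { allow_forget: bool },
    Adhoc,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Op {
    Insert,
    Update,
    Forget,
}

#[derive(Debug, PartialEq)]
pub enum Refusal {
    /// The deployment is locked down to declared invocables.
    AdhocDisabled,
    /// No declaration-site grant (reported as OE0730 for ad-hoc bodies).
    ForgetNotGranted,
}

/// Admit a body. Apart from the two checks below, both origins are treated alike.
pub fn admit(policy: AdhocPolicy, origin: Origin, ops: &[Op]) -> Result<(), Refusal> {
    if origin == Origin::Adhoc && policy == AdhocPolicy::DeclaredOnly {
        return Err(Refusal::AdhocDisabled);
    }
    let may_forget = matches!(origin, Origin::Declared { allow_forget: true });
    if ops.contains(&Op::Forget) && !may_forget {
        return Err(Refusal::ForgetNotGranted);
    }
    Ok(())
}
```

Note that Insert and Update pass through without any origin check: the only place Origin matters is the opt-out and the forget grant, which mirrors the parity rule.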
Forward compatibility: heterogeneous stores (keep this seam clean)
The stated future is specialized stores — relational / columnar / blob — that are “part of the Argon
knowledge graph,” queried uniformly, with per-data placement configured in ox.toml. That design is
not settled here, but this RFD must not foreclose it. Two principles, grounded in current-repo design
intent (RFD 0020) and external prior art (database catalog/connector SPIs; the BYODS work,
Sahebolamri et al., OOPSLA 2023):
- Schema stays strictly store-agnostic. Schema answers type questions only; it must never know where bytes live. Physical placement is a separate layer: RFD 0020’s BYODS (D6: a physical Relation is an interface; representations coexist) plus the RuntimeStorageBackend seam, selected per-relation by ox.toml placement. This is the OBDA shape (data stays in place, queried through the ontology; ox.toml placement is the R2RML analogue), and the catalog/connector SPIs (Calcite Schema / Table.getRowType, Trino ConnectorMetadata) confirm the split: the engine owns the type system; sources map into it and never own planner type semantics.
- The ad-hoc path lowers to LogicalPlan, not to a single in-memory catalog. RFD 0020 D2 already decided that ad-hoc queries, declared rules, and the type-checker goal all lower to one shared LogicalPlan (the IR scaffolded but currently dead in oxc-reasoning/src/logical/). Lowering ad-hoc bodies to that IR, rather than hard-wiring the current materialize_predicates pull-everything-into-memory model, is what keeps the surface multi-store-ready by construction. When pushdown arrives it follows the proven contract: an optimization, never an obligation, negotiated as (handle-that-absorbed-work, remainder) with the residual always re-checkable in-engine (Trino/FDW), capability modeled as binding patterns (a blob/KV store can’t free-scan, TSIMMIS), and shippability gated on determinism + identical both-sides semantics.
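The (absorbed, remainder) negotiation can be sketched with a toy key-value source. Nothing here is settled design: Filter, KvSource, and query are hypothetical, and the rows are bare (key, value) pairs. The sketch shows the contract's shape only: the source absorbs what it can serve (key lookups), and the engine re-applies everything it did not absorb.

```rust
/// A reduced predicate language over (key, value) rows (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Filter {
    KeyEq(i64),
    ValueGt(i64),
}

impl Filter {
    pub fn holds(self, (key, value): (i64, i64)) -> bool {
        match self {
            Filter::KeyEq(k) => key == k,
            Filter::ValueGt(v) => value > v,
        }
    }
}

/// A key-value source: it can look up by key but evaluates nothing else.
pub struct KvSource {
    pub rows: Vec<(i64, i64)>,
}

impl KvSource {
    /// Negotiate pushdown: returns (absorbed, remainder).
    pub fn push_down(&self, filters: &[Filter]) -> (Vec<Filter>, Vec<Filter>) {
        filters
            .iter()
            .copied()
            .partition(|f| matches!(f, Filter::KeyEq(_)))
    }

    /// Scan applying only the filters the source agreed to absorb.
    pub fn scan(&self, absorbed: &[Filter]) -> Vec<(i64, i64)> {
        self.rows
            .iter()
            .copied()
            .filter(|row| absorbed.iter().all(|f| f.holds(*row)))
            .collect()
    }
}

/// Engine side: the remainder is always re-checked in-engine.
pub fn query(source: &KvSource, filters: &[Filter]) -> Vec<(i64, i64)> {
    let (absorbed, remainder) = source.push_down(filters);
    source
        .scan(&absorbed)
        .into_iter()
        .filter(|row| remainder.iter().all(|f| f.holds(*row)))
        .collect()
}
```

A source that absorbs nothing still yields correct results in this shape, only slower, which is what "an optimization, never an obligation" amounts to.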
This RFD is, in effect, the realization of RFD 0020 D11 (“ad-hoc queries and mutations are
first-class … gating is an engine policy, not a language restriction”); its new contribution is the
runtime-frontend mechanism (the Schema query-provider, content-addressed identity, parity
discipline) that D11 left unspecified.
Rationale
- Reuse over reinvention. The execution substrate (compile-at-dispatch for queries, the general Operation interpreter for mutations) already exists and already runs at request time. Tier A is mostly routing: let the IR come from a request. This is why “ad-hoc is impossible by design” was always wrong.
- Lowering is already where we need it. Because parse is DB-free and LowerCtx is closure-based, the runtime lowering path is a Module-backed resolver + a dependency edge, not a rewrite of lowering.
- One frontend, no drift. Reusing oxc-instantiate lowering and oxc-check type-checking against a Module-backed context (rather than runtime-only reimplementations) keeps build-time and runtime behavior identical, honoring the spec↔code drift discipline. Byte-for-byte diagnostic agreement is the acceptance test.
- Full parity, no shortcuts. Ad-hoc bodies are type-checked exactly as declared bodies are (decision #1) and hold the same decidability ceiling by default (decision #2). A partially-checked ad-hoc surface would be a hollow feature; we do not ship one.
- Ad-hoc is the surface, not a sandbox. Declared forms are sugar over the generic path; ad-hoc has full affordance parity (decision #4). Control is deployment-level and uniform, never an ad-hoc-specific restriction.
- Default-on matches the product. Locking down is a deployment choice, not the substrate’s posture.
Alternatives considered
- Declared-only forever (status quo). Rejected: contradicts the stated design intent; “a database you can’t query ad hoc isn’t a database.”
- Source text only, IR never public. Likely, but not decided here: whether IR is also a public (prepared-statement-style) surface is folded into the performance/caching design (decision #3, §Open) so the answer is the correct one rather than a guess.
- A separate runtime-only frontend fed by an .oxbin catalog (decision-#3 option B). Rejected: faster to stand up but creates a second lowering/checking path that drifts from the build-time one, which is the exact failure mode the intent-node/drift-gate discipline exists to prevent.
- Ship ad-hoc with reduced/“lighter” validation first, full type-checking later. Rejected (decision #1): a half-checked surface is a hollow feature. Full oxc-check parity is in scope from the start, which is what pulls the checker into the runtime frontend.
- A special capability leash on ad-hoc writes (extra gates on Update/retract because they are ad-hoc). Rejected (decision #4): ad-hoc has affordance parity; control is uniform and deployment-level. The lone exception is forget, refused for ad-hoc. That is not a leash on affordance: forget’s #[allow_forget] capability is conferred at a declaration site a request doesn’t have, so the request can’t forge it (see Security).
- A generic untyped entity-write endpoint (POST /v1/entities). Rejected/kept-absent: ad-hoc writes belong to the typed mutate / Operation mechanism, not an untyped blob surface.
Consequences
- New runtime dependencies: oxc-serve / oxc-runtime gain the frontend: oxc-parser, oxc-instantiate, and (per decision #1) oxc-check / oxc-resolver / oxc-types, once their environment access is routed through Schema. This is a substantial change to the runtime’s relationship to the frontend (the runtime/AGENTS.md “the reasoner was not built here” tombstone framing and the oxc-runtime / oxc-serve intent nodes all need updating). Introducing Schema as the sole environment-access contract, with the build-time backend over ASTs and the runtime backend over the event-log projection, is the bulk of the engineering and lands as its own arc before the surface is wired.
- Artifact identity + drift guard already exist; per-symbol identity is the residual. Artifact identity (composition signature + per-section content hashes) and the version/tier load gates are already built (oxc-oxbin: composition_signature.rs, content_hash.rs, versioning.rs, validation.rs). The runtime backend reuses them. What remains is threading that identity to per-event/per-symbol resolution (the storage-side module_id collision gap) so two schemas’ symbol ids can’t alias; this is shared with the storage-identity fix, not additive.
- New Schema-backing Module surface: name/type/parameter/world-assumption/refinement resolution over the CQRS catalog projection (additive; the facts are already in the .oxbin decl bodies, so there is no new serialized representation and no embedded source).
- New wire + CLI surface: a generic submission request shape and ox query '<body>' / REPL entry (exact shapes in the implementing PRs).
- Spec/Lean: per the repo workflow, this is language-surface: RFD + reference draft → Lean → code. The reference (spec/reference/) gains an ad-hoc-submission section; the Lean substrate is unaffected in its semantics (an ad-hoc rule is just a rule), but the storage/runtime contract may need to record that evaluation admits request-sourced rules, and the security/opt-out posture should be described where the serving contract lives.
- The “rejected by design” framing is retired in code comments, AGENTS nodes, and project memory.
Open questions
Decisions #1–#4 are settled above, and the resolution mechanism is settled as the query-provider
Schema interface (one checker, build-time backend over ASTs, runtime backend over the event-log
projection — the rustc/Go model). What remains genuinely open:
- The exact Schema operation set. The minimal trait surface (it must cover name→declaration resolution, field/param types, subsumption edges, enum variants, world-assumption, and refinement metadata) and how much it reuses the indexes Module already builds (concept_ids, relation_signatures, etc.) vs. adds. Identity/fingerprint is not open: the artifact already carries it (composition signature + section hashes), and the backend keys on that. Lazy per-name materialization (the Idris .ttc pattern) is a future optimization, not needed for v1 since Module already eagerly indexes the (small) schema.
- The performance / prepared-body design (decision #3). The load-bearing open thread: compile-caching of recurring ad-hoc bodies (keyed by body hash + composition signature; the content-hash machinery already exists), whether a public prepared-IR fast-path is the correct surface, and how Salsa incrementality is reused at runtime. The IR-submission question is answered here, not in isolation.
- Materialization model. Ad-hoc reads today inherit the full in-memory materialize_predicates build (oxc-reasoning; SemiNaiveExecutor). The intended replacement is a content-addressed, generation-invalidated projection cache (the read-model section is already reserved in .oxbin and invalidation exists in oxc-storage-pg get_projection_cache, but it is not populated; the DBSP/IVM executor is drop-in-ready but gated), and it is a real forward arc. The ad-hoc path should target that Engine/projection-cache seam rather than entrench the full-scan, and this overlaps the external/foreign-relation (“market oracle”) thread.
- Standpoint / fork / bitemporal scoping. Ad-hoc bodies need the same as_of / standpoint / fork context as declared dispatch; query_body_dispatch currently refuses across-standpoint parameterized bodies (oxc-runtime/src/lib.rs:6787). The ad-hoc path must reach full parity here, so that refusal is a gap to close, not a boundary.
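The compile-caching idea in the performance thread above (key on body hash plus composition signature) can be sketched as follows. This is not a design decision: CacheKey and CompileCache are hypothetical, the u64 signature is a stand-in for the real composition signature, and std's DefaultHasher stands in for a collision-resistant content hash, which a real cache would need.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Cache key: the body's hash plus the identity of the schema it compiled against.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct CacheKey {
    body_hash: u64,
    composition_signature: u64,
}

/// Caches compiled bodies of any cloneable type `T`.
pub struct CompileCache<T> {
    entries: HashMap<CacheKey, T>,
    /// Number of cache misses, exposed so the behavior is observable.
    pub compiles: usize,
}

impl<T: Clone> CompileCache<T> {
    pub fn new() -> Self {
        Self { entries: HashMap::new(), compiles: 0 }
    }

    /// Return the cached compilation for (body, signature), compiling on a miss.
    pub fn get_or_compile(
        &mut self,
        body: &str,
        composition_signature: u64,
        compile: impl FnOnce(&str) -> T,
    ) -> T {
        // Stand-in hash only: not collision-resistant and not stable across builds.
        let mut hasher = DefaultHasher::new();
        body.hash(&mut hasher);
        let key = CacheKey { body_hash: hasher.finish(), composition_signature };

        if let Some(hit) = self.entries.get(&key) {
            return hit.clone();
        }
        let compiled = compile(body);
        self.compiles += 1;
        self.entries.insert(key, compiled.clone());
        compiled
    }
}
```

Including the signature in the key means a schema change invalidates every cached body without any explicit eviction step: old entries simply stop being hit.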