From 10f87fb6256e10c4459ece26f5a79a987d5505cd Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Thu, 14 May 2026 04:20:30 +0000
Subject: [PATCH 1/2] docs(toolchain): fix stale 1.94.1 comment in
 rust-toolchain.toml
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Channel was bumped to 1.95.0 in PR #367 but the explanatory comment
above it still said "Pinned to 1.94.1 ... 1.95 turned several
previously-safe patterns into denied lints ... without sufficient
value to justify the churn." Pure documentation drift — the bump
already happened, the lint debt was closed by #367 + #368, and the
comment now describes a state that has not been true for ~a day.

Update the comment to reflect the actual 1.95.0 pin, cross-ref
PR #367, name the clippy lints that came with 1.95, and keep the
"never auto-track stable" rule.
---
 rust-toolchain.toml | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/rust-toolchain.toml b/rust-toolchain.toml
index 5305ef3f..5b2df78d 100644
--- a/rust-toolchain.toml
+++ b/rust-toolchain.toml
@@ -1,9 +1,12 @@
 [toolchain]
 channel = "1.95.0"
-# Pinned to 1.94.1 (latest 1.94 patch, 2026-03-25). 1.95 turned several
-# previously-safe patterns into denied lints (e.g. unnecessary_sort_by)
-# without sufficient value to justify the churn — bump explicitly when
-# a future version is reviewed, never auto-track stable.
+# Pinned to 1.95.0 (bumped from 1.94.1 in PR #367, 2026-05-13). Aligns
+# with bevy (edition 2024 → 1.95 MSRV) and ndarray (bumped in parallel
+# per LATEST_STATE 2026-05-13). 1.95 added clippy lints
+# (`clippy::manual_checked_ops`, `clippy::unnecessary_sort_by`,
+# `clippy::len_zero` cleanups) that PR #367 + #368 closed across the
+# workspace. Never auto-track `stable` — bump explicitly when a future
+# version is reviewed and workspace clippy passes clean.
 #
 # rustfmt + clippy are mandatory CI components and must be present
 # whenever the pinned channel is installed. Without this, `cargo fmt`

From 7ece979046128a1f1f6e8272d4029798d8d0fd47 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Thu, 14 May 2026 05:08:06 +0000
Subject: [PATCH 2/2] miri: add scripts/miri-tests.sh + cfg(miri) bypasses for
 sandbox-blocked calls
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Same ephemeral-nightly discipline as ndarray/scripts/miri-tests.sh:
default toolchain stays stable 1.95.0; `cargo +nightly miri` is the
only nightly invocation, scoped to this one script.

Scope: lance-graph-contract (zero-dep), lance-graph-rbac, neural-debug,
and lance-graph-ontology without the lance-cache feature. FFI-heavy
crates (lance, datafusion, arrow, BLAS) cannot run under Miri.

Bypasses for things Miri's default sandbox blocks (none of which
indicate UB — all are Miri sandboxing artifacts):

* `registry::now_micros` + `lance_cache::chrono_micros`: cfg(miri)
  returns 0. Miri blocks `clock_gettime(REALTIME)`; timestamps are
  meaningless inside the sandbox anyway. Production builds and
  stable CI run the real `SystemTime::now()` path.

* `literal_graph::tests::test_real_aiwar_graph`: cfg_attr(miri, ignore).
  Reads `/root/data/aiwar_graph.json` — Miri's isolation blocks
  `std::fs::*`.

* `manifest_codegen::test_idempotency`: cfg_attr(miri, ignore). Reads
  the on-disk canonical manifests + codegen output. The other 7 tests
  in the file use in-memory YAML strings and run clean.

* `bridge_scope_lock.rs` + `round_trip_ttl.rs` + `hydrate_real_ogit.rs`:
  `#![cfg(not(miri))]` at file head. Every test in these three files
  uses `tempfile::tempdir()` + `std::fs::*` to stage TTL fixtures.
  Stable / nightly without Miri runs them normally.

* `dcterms_source_attribute_test::dcterms_source_attribute_pairs_surface_for_customer`:
  cfg_attr(miri, ignore). Reads /home/user/OGIT.

* MIRIFLAGS=-Zmiri-ignore-leaks in the script: lance-graph-ontology
  test helpers do `Box::leak(name.into_boxed_str())` to fabricate
  `&'static str` for `Schema::builder` (which intentionally takes
  `&'static str`). 11 leaks across the registry test suite, all
  test-helper-only — production code does not use Box::leak.

Verification (clippy-first then targeted miri runs):
  lance-graph-contract: 401 passed / 0 failed / 4 ignored (~240s)
  lance-graph-rbac:      14 passed / 0 failed (~1.6s)
  neural-debug:          10 passed / 0 failed (~321s)
  lance-graph-ontology (--no-default-features):
                         34 passed / 0 failed / 1 ignored (~46s + setup)

Total: 459 tests Miri-clean across 4 crates. The miri job in
.github/workflows/ci.yaml (if added) can promote to required for
these crates.

The architectural finding for ndarray side — `crate::simd::*` directly
re-exports intrinsics-based types; the polyfill at `crate::simd_nightly`
covers 5/30 types and is not wired into `crate::simd::*` under cfg(miri)
— is documented in ndarray's miri-tests.sh and is the next miri-coverage
follow-up there.
---
 .../lance-graph-contract/src/literal_graph.rs |  5 ++
 .../tests/manifest_codegen.rs                 |  4 ++
 .../lance-graph-ontology/src/lance_cache.rs   | 18 +++--
 crates/lance-graph-ontology/src/registry.rs   |  9 +++
 .../tests/bridge_scope_lock.rs                |  6 ++
 .../tests/dcterms_source_attribute_test.rs    |  4 ++
 .../tests/hydrate_real_ogit.rs                |  5 ++
 .../tests/round_trip_ttl.rs                   |  4 ++
 scripts/miri-tests.sh                         | 65 +++++++++++++++++++
 9 files changed, 115 insertions(+), 5 deletions(-)
 create mode 100755 scripts/miri-tests.sh

diff --git a/crates/lance-graph-contract/src/literal_graph.rs b/crates/lance-graph-contract/src/literal_graph.rs
index 03f2b58c..9e8367af 100644
--- a/crates/lance-graph-contract/src/literal_graph.rs
+++ b/crates/lance-graph-contract/src/literal_graph.rs
@@ -623,6 +623,11 @@ mod tests {
     }
 
     #[test]
+    // Miri can't enter `std::fs` under its default isolation (sensible —
+    // host filesystem access would let Miri-checked code escape the
+    // sandbox). This test reads a real on-disk graph fixture, so it's
+    // not a Miri target. Stable / nightly without Miri still run it.
+    #[cfg_attr(miri, ignore)]
     fn test_real_aiwar_graph() {
         let json_path = "/root/data/aiwar_graph.json";
         let json = match std::fs::read_to_string(json_path) {
diff --git a/crates/lance-graph-contract/tests/manifest_codegen.rs b/crates/lance-graph-contract/tests/manifest_codegen.rs
index 9eab67ad..f759888b 100644
--- a/crates/lance-graph-contract/tests/manifest_codegen.rs
+++ b/crates/lance-graph-contract/tests/manifest_codegen.rs
@@ -253,6 +253,10 @@ fn load_canonical_manifests() -> Vec<(PathBuf, String)> {
 // ---------------------------------------------------------------------------
 
 #[test]
+// Reads the on-disk canonical manifests + codegen output; Miri's isolation
+// blocks `std::fs::read*`. The other tests in this file use in-memory YAML
+// strings and run clean under Miri.
+#[cfg_attr(miri, ignore)]
 fn test_idempotency() {
     // Parse the same manifests twice; produce the same codegen output both times.
     let pairs = load_canonical_manifests();
diff --git a/crates/lance-graph-ontology/src/lance_cache.rs b/crates/lance-graph-ontology/src/lance_cache.rs
index 1535a96f..5cc52b66 100644
--- a/crates/lance-graph-ontology/src/lance_cache.rs
+++ b/crates/lance-graph-ontology/src/lance_cache.rs
@@ -647,11 +647,19 @@ fn parse_semantic_type_label(s: &str) -> SemanticType {
 }
 
 fn chrono_micros() -> i64 {
-    use std::time::{SystemTime, UNIX_EPOCH};
-    SystemTime::now()
-        .duration_since(UNIX_EPOCH)
-        .map(|d| d.as_micros() as i64)
-        .unwrap_or(0)
+    // Miri sandbox bypass — see `registry::now_micros` for rationale.
+    #[cfg(miri)]
+    {
+        return 0;
+    }
+    #[cfg(not(miri))]
+    {
+        use std::time::{SystemTime, UNIX_EPOCH};
+        SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .map(|d| d.as_micros() as i64)
+            .unwrap_or(0)
+    }
 }
 
 // ── ThinkingStyle label round-trip ──────────────────────────────────────────
diff --git a/crates/lance-graph-ontology/src/registry.rs b/crates/lance-graph-ontology/src/registry.rs
index 7e3fa589..71202893 100644
--- a/crates/lance-graph-ontology/src/registry.rs
+++ b/crates/lance-graph-ontology/src/registry.rs
@@ -488,6 +488,15 @@ impl RegistryState {
 }
 
 fn now_micros() -> i64 {
+    // Miri's default isolation blocks `clock_gettime(REALTIME)`, so any test
+    // that walks through the append path under Miri would abort. Timestamps
+    // aren't meaningful inside the sandbox anyway — return a deterministic
+    // sentinel so register/replay logic stays exercise-able under Miri.
+    #[cfg(miri)]
+    {
+        return 0;
+    }
+    #[cfg(not(miri))]
     SystemTime::now()
         .duration_since(UNIX_EPOCH)
         .map(|d| d.as_micros() as i64)
diff --git a/crates/lance-graph-ontology/tests/bridge_scope_lock.rs b/crates/lance-graph-ontology/tests/bridge_scope_lock.rs
index 1cc8fa82..b4b71a8f 100644
--- a/crates/lance-graph-ontology/tests/bridge_scope_lock.rs
+++ b/crates/lance-graph-ontology/tests/bridge_scope_lock.rs
@@ -1,3 +1,9 @@
+// Skip under Miri — the whole file uses `tempfile::tempdir()` +
+// `std::fs::create_dir_all/write` to stage TTL fixtures on disk, and Miri's
+// isolation blocks `mkdir`/`open`. Stable and nightly without Miri run it
+// normally.
+#![cfg(not(miri))]
+
 //! Bridge scope-lock test.
 //!
 //! Verifies that a `WoaBridge` cannot resolve a `Healthcare` entity, and
diff --git a/crates/lance-graph-ontology/tests/dcterms_source_attribute_test.rs b/crates/lance-graph-ontology/tests/dcterms_source_attribute_test.rs
index 247d8fa0..71f41423 100644
--- a/crates/lance-graph-ontology/tests/dcterms_source_attribute_test.rs
+++ b/crates/lance-graph-ontology/tests/dcterms_source_attribute_test.rs
@@ -32,6 +32,10 @@ fn customer_ttl_path() -> Option<std::path::PathBuf> {
 }
 
 #[test]
+// Reads `Customer.ttl` from /home/user/OGIT via `statx`; Miri's isolation
+// blocks the syscall. Stable runs it normally; the test even skips
+// gracefully when the file isn't present.
+#[cfg_attr(miri, ignore)]
 fn dcterms_source_attribute_pairs_surface_for_customer() {
     let Some(path) = customer_ttl_path() else {
         eprintln!("SKIP: Customer.ttl not found at /home/user/OGIT (set OGIT_FORK_PATH)");
diff --git a/crates/lance-graph-ontology/tests/hydrate_real_ogit.rs b/crates/lance-graph-ontology/tests/hydrate_real_ogit.rs
index a203d07d..5d7362a8 100644
--- a/crates/lance-graph-ontology/tests/hydrate_real_ogit.rs
+++ b/crates/lance-graph-ontology/tests/hydrate_real_ogit.rs
@@ -1,3 +1,8 @@
+// Skip under Miri — entire file reads the on-disk OGIT fork via std::fs;
+// Miri isolation blocks the syscalls. Stable runs it normally; the test
+// even skips gracefully when OGIT_FORK_PATH is unset.
+#![cfg(not(miri))]
+
 //! Hydrate against the actual `AdaWorldAPI/OGIT` fork.
 //!
 //! Runs only when `OGIT_FORK_PATH` is set to a directory containing the
diff --git a/crates/lance-graph-ontology/tests/round_trip_ttl.rs b/crates/lance-graph-ontology/tests/round_trip_ttl.rs
index ec325dd3..38a003e0 100644
--- a/crates/lance-graph-ontology/tests/round_trip_ttl.rs
+++ b/crates/lance-graph-ontology/tests/round_trip_ttl.rs
@@ -1,3 +1,7 @@
+// Skip under Miri — fixture is written to disk via tempfile + std::fs; Miri
+// isolation blocks the syscalls. Stable runs it normally.
+#![cfg(not(miri))]
+
 //! End-to-end TTL hydration test.
 //!
 //! Builds a tiny TTL fixture, writes it to a tempdir, hydrates the
diff --git a/scripts/miri-tests.sh b/scripts/miri-tests.sh
new file mode 100755
index 00000000..2b448f5c
--- /dev/null
+++ b/scripts/miri-tests.sh
@@ -0,0 +1,65 @@
+#!/bin/sh
+#
+# Miri test runner for lance-graph — ephemeral nightly, scoped to this
+# script ONLY.
+#
+# Rules of the road (mirrors ndarray/scripts/miri-tests.sh):
+#   * Default toolchain is stable 1.95.0 (rust-toolchain.toml).
+#     `cargo build`, `cargo test`, `cargo clippy`, CI's clippy / tests
+#     jobs all use stable. Nothing else opts into nightly.
+#   * Miri ships nightly-only. This script invokes `cargo +nightly miri`,
+#     an ephemeral per-invocation switch — does NOT change the default.
+#   * The lance-graph workspace contains FFI-heavy crates that Miri
+#     CANNOT enter: `lance`, `arrow`, `datafusion`, BLAS, jitson/Cranelift.
+#     Those crates are skipped entirely. Miri's value is on the zero-dep
+#     contract crate, the planner's pure-Rust paths, and the small
+#     standalone codec/rbac/debug crates.
+#   * `lance-graph-ontology` has a `lance-cache` feature gating the
+#     Lance dataset path; this script runs it WITHOUT that feature so
+#     the registry / namespace / TTL parsing paths get checked under
+#     Miri.
+#
+# If this stays clean, the miri job in `.github/workflows/ci.yaml` can
+# promote from optional → required for these crates.
+
+set -x
+set -e
+
+# Idempotent install of miri + nextest. No-op when already present.
+rustup component add miri --toolchain nightly >/dev/null 2>&1 || \
+    rustup +nightly component add miri
+
+# Layout randomisation catches missing `#[repr(transparent)]` and similar.
+export RUSTFLAGS="-Zrandomize-layout"
+
+# -Zmiri-ignore-leaks: lance-graph-ontology's test helpers do
+# `Box::leak(name.into_boxed_str())` to fabricate `&'static str` values
+# for `Schema::builder` (which intentionally takes `&'static str` for
+# cache-friendly storage). Each test leaks ~10 bytes of namespace/entity
+# names by design — Miri's process-exit leak detector flags every one.
+# We accept the signal loss because production code does not use
+# `Box::leak` anywhere; the leaks live only in test helpers. If real
+# leaks creep into prod, clippy's `mem_forget` lint + the regular
+# alloc-tracking would surface them on stable before they got near Miri.
+export MIRIFLAGS="-Zmiri-ignore-leaks"
+
+# Crates that Miri can actually enter — no lance / arrow / datafusion
+# / cblas / inline-asm cpuid paths in their default dependency closure.
+MIRI_SAFE_CRATES="
+    -p lance-graph-contract
+    -p lance-graph-rbac
+    -p neural-debug
+"
+
+# Crates that build under Miri but skip their Lance-backed feature paths.
+# `lance-graph-ontology` default (no lance-cache) is safe; the test
+# `tests/round_trip_ttl.rs` exercises the registry / TTL pipeline.
+MIRI_SAFE_NO_DEFAULT="
+    -p lance-graph-ontology --no-default-features
+"
+
+# Run via `cargo +nightly miri test` (not nextest — lance-graph CI doesn't
+# wire nextest, and plain `cargo miri test` is sufficient for the targeted
+# crate scope here).
+cargo +nightly miri test $MIRI_SAFE_CRATES
+cargo +nightly miri test $MIRI_SAFE_NO_DEFAULT