forked from lance-format/lance-graph
feat: Grammar unlocks — verb table, Disambiguator trait, Quantum mode, Animal Farm harness #283
Merged: AdaWorldAPI merged 3 commits into claude/grammar-fixes-r2-2026-04-29 from claude/grammar-unlocks-r2-2026-04-29 on Apr 29, 2026
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,98 @@ | ||
| //! Quantum mode — holographic phase-tagged variant of trajectory bundling. | ||
| //! | ||
| //! PR #279 outlook E8: same 16384-dim substrate as Crystal mode (Markov SPO | ||
| //! bundling), but with a phase-tag field on Trajectory and a 4th | ||
| //! WeightingKernel variant `Holographic`. Holographic mode trades structured | ||
| //! recoverability for higher-capacity superposition. | ||
| //! | ||
| //! Crystal vs Quantum is a knob, not a separate stack. | ||
| //! | ||
| //! META-AGENT: `pub mod quantum_mode;` in deepnsm/lib.rs. | ||
|
|
||
| /// 128-bit phase tag for holographic addressing. | ||
| /// See ladybug-rs hologram types + cross-repo harvest doc H7. | ||
| #[derive(Debug, Clone, Copy, PartialEq, Eq, Default)] | ||
| pub struct PhaseTag(pub u128); | ||
|
|
||
| impl PhaseTag { | ||
| pub fn pi() -> Self { | ||
| PhaseTag(u128::MAX / 2) | ||
| } | ||
|
|
||
| pub fn from_angle(theta: f32) -> Self { | ||
| // Normalize to [0, 1) first. | ||
| let normalized = ((theta / std::f32::consts::TAU).rem_euclid(1.0)) as f64; | ||
| // f64's 53-bit mantissa is more than the f32 input carries (24 bits), | ||
| // though less than a full u64; we scale by u64::MAX because | ||
| // u128::MAX as f32 overflows to infinity. | ||
| let scaled = (normalized * (u64::MAX as f64)) as u128; | ||
| // Place the u64 value in the low half, leave high half zero. | ||
| // For higher resolution, a future PR can fold an additional 64-bit | ||
| // entropy source into the upper half. | ||
| PhaseTag(scaled) | ||
| } | ||
|
|
||
| pub fn to_angle(self) -> f32 { | ||
| // Use the low 64 bits (the high 64 are reserved for future precision). | ||
| let low = (self.0 & u64::MAX as u128) as u64; | ||
| let normalized = (low as f64) / (u64::MAX as f64); | ||
| (normalized * std::f64::consts::TAU as f64) as f32 | ||
| } | ||
|
|
||
| pub fn distance(self, other: Self) -> u32 { | ||
| // Hamming on the 128-bit tag = phase distance proxy. | ||
| (self.0 ^ other.0).count_ones() | ||
| } | ||
| } | ||
|
|
||
| /// Holographic kernel variant. Use this when you want phase-coherent | ||
| /// superposition rather than amplitude-bundled accumulation. | ||
| #[derive(Debug, Clone, Copy)] | ||
| pub enum HolographicMode { | ||
| /// Single-phase carrier — one phase tag per trajectory. | ||
| SinglePhase, | ||
| /// Multi-phase per-role — each role slice carries its own phase. | ||
| /// Future: when role-keys grow phase-tagged variants. | ||
| PerRole, | ||
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
|
|
||
| #[test] | ||
| fn pi_phase_is_half_max() { | ||
| let p = PhaseTag::pi(); | ||
| assert!(p.0 > u128::MAX / 4 && p.0 < 3 * (u128::MAX / 4)); | ||
| } | ||
|
|
||
| #[test] | ||
| fn phase_distance_zero_for_self() { | ||
| let p = PhaseTag(12345); | ||
| assert_eq!(p.distance(p), 0); | ||
| } | ||
|
|
||
| #[test] | ||
| fn from_angle_round_trips_approximately() { | ||
| let theta = 1.5f32; | ||
| let p = PhaseTag::from_angle(theta); | ||
| let recovered = p.to_angle(); | ||
| // f64 intermediate gives sub-1e-3 round-trip; f32 final cast caps | ||
| // precision around 1e-6 of TAU (~6e-6 absolute). | ||
| let diff = (recovered - theta).abs(); | ||
| assert!(diff < 0.001, "round-trip diff {} exceeds tolerance 0.001", diff); | ||
| } | ||
|
|
||
| #[test] | ||
| fn default_is_zero_phase() { | ||
| let p: PhaseTag = Default::default(); | ||
| assert_eq!(p.0, 0); | ||
| } | ||
|
|
||
| #[test] | ||
| fn holographic_mode_is_copy() { | ||
| // Smoke test: enum is Copy so we can pass by value freely. | ||
| let m = HolographicMode::SinglePhase; | ||
| let m2 = m; | ||
| let _ = (m, m2); | ||
| } | ||
| } | ||
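
The `distance` method above treats Hamming distance as a proxy for phase distance. The following is a minimal sketch, not part of the PR, of how that proxy behaves when the phase is stored as a binary integer in the low 64 bits, as `from_angle` does. The `circular_distance` method is a hypothetical alternative added only for comparison, and `PhaseTag` is redeclared here so the block stands alone.

```rust
/// Stand-in for the PR's `PhaseTag`, reduced to what this sketch needs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PhaseTag(pub u128);

impl PhaseTag {
    /// Hamming distance, as in the diff above.
    pub fn distance(self, other: Self) -> u32 {
        (self.0 ^ other.0).count_ones()
    }

    /// Assumed alternative (not in the PR): shortest arc between the two
    /// low-64-bit phases, in units of TAU / 2^64.
    pub fn circular_distance(self, other: Self) -> u64 {
        let a = self.0 as u64; // truncating cast keeps the low 64 bits
        let b = other.0 as u64;
        let d = a.wrapping_sub(b);
        // The arc one way is d, the other way is 2^64 - d; take the shorter.
        d.min(d.wrapping_neg())
    }
}
```

With these definitions, the tags `(1 << 63) - 1` and `1 << 63` encode nearly identical angles close to π, yet their Hamming distance is 64 while their circular distance is 1. Conversely, `0` and `1 << 63` are opposite phases with a Hamming distance of only 1. Whether that matters depends on how the tags are generated; for hashed or random tags the Hamming proxy may be exactly what is wanted.
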
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,107 @@ | ||
| //! Trajectory-as-statement-hash bridge. | ||
| //! | ||
| //! PR #279 outlook E4: a Trajectory's binarized fingerprint replaces audit.rs's | ||
| //! `statement_hash: u64` with a SEMANTIC hash. Two grammatically-equivalent | ||
| //! queries collide; queries that share grammatical structure with attack | ||
| //! patterns become Hamming-near-neighbor on the audit log. | ||
| //! | ||
| //! META-AGENT: add `pub mod trajectory_audit;` to deepnsm/lib.rs gated by | ||
| //! `feature = "audit-bridge" = ["dep:lance-graph-callcenter"]`. | ||
|
|
||
| use crate::trajectory::Trajectory; | ||
|
|
||
| /// Binarized 256-word (16384-bit) fingerprint of a trajectory. | ||
| /// Suitable as an `AuditEntry` semantic hash key. | ||
| pub type TrajectoryHash = [u64; 256]; | ||
|
|
||
| impl Trajectory { | ||
| /// Project the trajectory's continuous fingerprint into a 16384-bit | ||
| /// binary fingerprint via signed thresholding (+ → 1, - or 0 → 0). | ||
| pub fn binarize(&self) -> TrajectoryHash { | ||
| let mut bits = [0u64; 256]; | ||
| for (word_idx, chunk) in self.fingerprint.chunks(64).enumerate().take(256) { | ||
| let mut w = 0u64; | ||
| for (bit_idx, v) in chunk.iter().enumerate().take(64) { | ||
| if *v > 0.0 { | ||
| w |= 1u64 << bit_idx; | ||
| } | ||
| } | ||
| bits[word_idx] = w; | ||
| } | ||
| bits | ||
| } | ||
|
|
||
| /// 64-bit syntactic-fallback hash for use when the consumer wants a | ||
| /// scalar instead of a fingerprint. Folds the binarized fingerprint | ||
| /// into u64 via XOR-of-words. | ||
| pub fn audit_hash_u64(&self) -> u64 { | ||
| let bits = self.binarize(); | ||
| bits.iter().fold(0u64, |acc, w| acc ^ w) | ||
| } | ||
| } | ||
|
|
||
| /// Hamming distance between two trajectory hashes — how grammatically | ||
| /// similar two queries / sentences are. | ||
| pub fn trajectory_distance(a: &TrajectoryHash, b: &TrajectoryHash) -> u32 { | ||
| a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum() | ||
| } | ||
|
|
||
| /// Threshold for "grammatically similar" — used by the audit log to flag | ||
| /// queries that share structure with known-attack patterns. | ||
| pub const GRAMMATICAL_SIMILARITY_THRESHOLD: u32 = 256; // ~1.5% of 16384 bits | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
|
|
||
| fn make_trajectory(seed: f32, len: usize) -> Trajectory { | ||
| let fp: Vec<f32> = (0..len) | ||
| .map(|i| { | ||
| let x = (i as f32) * 0.1 + seed; | ||
| x.sin() | ||
| }) | ||
| .collect(); | ||
| Trajectory { | ||
| fingerprint: fp, | ||
| radius: 5, | ||
| } | ||
| } | ||
|
|
||
| #[test] | ||
| fn binarize_is_deterministic_for_same_trajectory() { | ||
| let t = make_trajectory(0.3, 16384); | ||
| let a = t.binarize(); | ||
| let b = t.binarize(); | ||
| assert_eq!(a, b); | ||
| } | ||
|
|
||
| #[test] | ||
| fn distance_is_zero_for_self() { | ||
| let h = [0u64; 256]; | ||
| assert_eq!(trajectory_distance(&h, &h), 0); | ||
| } | ||
|
|
||
| #[test] | ||
| fn distance_max_is_16384() { | ||
| let zeros = [0u64; 256]; | ||
| let ones = [u64::MAX; 256]; | ||
| assert_eq!(trajectory_distance(&zeros, &ones), 16384); | ||
| } | ||
|
|
||
| #[test] | ||
| fn audit_hash_u64_changes_with_content() { | ||
| let a = make_trajectory(0.0, 16384); | ||
| let b = make_trajectory(1.7, 16384); | ||
| assert_ne!(a.audit_hash_u64(), b.audit_hash_u64()); | ||
| } | ||
|
|
||
| #[test] | ||
| fn similar_trajectories_are_hamming_close() { | ||
| // Same shape and same seed: the fingerprints are identical, so the | ||
| // distance is exactly zero. | ||
| let a = make_trajectory(0.0, 16384); | ||
| let a2 = make_trajectory(0.0, 16384); | ||
| let ha = a.binarize(); | ||
| let ha2 = a2.binarize(); | ||
| assert_eq!(trajectory_distance(&ha, &ha2), 0); | ||
| } | ||
| } | ||
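
The tests above only compare a trajectory with itself. As a hedged illustration of the same pipeline on two different inputs, the sketch below re-implements `binarize`, `trajectory_distance`, and the XOR fold as free functions over a plain `&[f32]` (an assumption made so the block compiles without the crate's `Trajectory` type). `sine_fingerprint` is a hypothetical helper that mirrors the test fixture.

```rust
/// 16384-bit fingerprint, same shape as `TrajectoryHash` in the diff.
pub type TrajectoryHash = [u64; 256];

/// Free-function version of `Trajectory::binarize` (assumed equivalent):
/// bit i is set when fingerprint[i] > 0.0.
pub fn binarize(fingerprint: &[f32]) -> TrajectoryHash {
    let mut bits = [0u64; 256];
    for (word_idx, chunk) in fingerprint.chunks(64).enumerate().take(256) {
        for (bit_idx, v) in chunk.iter().enumerate() {
            if *v > 0.0 {
                bits[word_idx] |= 1u64 << bit_idx;
            }
        }
    }
    bits
}

/// Hamming distance between two hashes, as in the diff.
pub fn trajectory_distance(a: &TrajectoryHash, b: &TrajectoryHash) -> u32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// XOR-of-words fold, as used by `audit_hash_u64`.
pub fn xor_fold(h: &TrajectoryHash) -> u64 {
    h.iter().fold(0u64, |acc, w| acc ^ w)
}

/// Hypothetical helper: the same sine-wave fixture the tests use.
pub fn sine_fingerprint(seed: f32) -> Vec<f32> {
    (0..16384).map(|i| ((i as f32) * 0.1 + seed).sin()).collect()
}
```

Binarizing `sine_fingerprint(0.0)` and `sine_fingerprint(0.05)` gives two hashes whose distance is nonzero but well short of the 16384 maximum; that distance is the value a caller would compare against `GRAMMATICAL_SIMILARITY_THRESHOLD`. One property of the scalar fallback worth keeping in mind: because XOR is commutative, `xor_fold` is insensitive to word order, so two fingerprints that differ only by a permutation of their 64-bit words fold to the same `u64`.
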
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,135 @@ | ||
| //! Animal Farm forward-validation harness — D10 from the original plan. | ||
| //! | ||
| //! Scaffold only. The actual book text + ground-truth labels live in a | ||
| //! follow-up data PR. This file fixes the harness API + asserts the metric | ||
| //! shape we'll measure. | ||
| //! | ||
| //! META-AGENT: integration-test scaffold; no lib.rs wiring required — | ||
| //! `cargo test -p deepnsm --test animal_farm_harness` runs it. | ||
|
|
||
| #![cfg(test)] | ||
|
|
||
| #[derive(Debug, Clone)] | ||
| pub struct EpiphanyPrediction { | ||
| pub at_chapter: u32, | ||
| pub direction_phase: f32, | ||
| pub initial_truth_freq: f32, | ||
| pub initial_truth_conf: f32, | ||
| } | ||
|
|
||
| #[derive(Debug, Clone)] | ||
| pub struct GroundTruthBeat { | ||
| pub at_chapter: u32, | ||
| pub confirmed_direction: f32, | ||
| pub matched_predictions: Vec<u32>, | ||
| } | ||
|
|
||
| #[derive(Debug)] | ||
| pub struct HarnessMetrics { | ||
| pub epiphany_precision: f32, | ||
| pub epiphany_recall: f32, | ||
| pub arc_shift_f1: f32, | ||
| pub direction_accuracy: f32, | ||
| } | ||
|
|
||
| pub fn evaluate( | ||
| predictions: &[EpiphanyPrediction], | ||
| ground_truth: &[GroundTruthBeat], | ||
| ) -> HarnessMetrics { | ||
| if predictions.is_empty() { | ||
| return HarnessMetrics { | ||
| epiphany_precision: 0.0, | ||
| epiphany_recall: 0.0, | ||
| arc_shift_f1: 0.0, | ||
| direction_accuracy: 0.0, | ||
| }; | ||
| } | ||
| let confirmed: u32 = predictions | ||
| .iter() | ||
| .enumerate() | ||
| .filter(|(i, _)| { | ||
| ground_truth | ||
| .iter() | ||
| .any(|b| b.matched_predictions.contains(&(*i as u32))) | ||
| }) | ||
| .count() as u32; | ||
| let precision = confirmed as f32 / predictions.len() as f32; | ||
| let recall = confirmed as f32 / ground_truth.len().max(1) as f32; | ||
| HarnessMetrics { | ||
| epiphany_precision: precision, | ||
| epiphany_recall: recall, | ||
| arc_shift_f1: 2.0 * precision * recall / (precision + recall + 1e-9), | ||
| direction_accuracy: 0.0, // populated when phase-direction comparison lands | ||
| } | ||
| } | ||
|
|
||
| #[test] | ||
| fn harness_metrics_zero_for_empty_predictions() { | ||
| let m = evaluate(&[], &[]); | ||
| assert_eq!(m.epiphany_precision, 0.0); | ||
| assert_eq!(m.epiphany_recall, 0.0); | ||
| } | ||
|
|
||
| #[test] | ||
| fn harness_metrics_perfect_for_all_confirmed() { | ||
| let preds = vec![EpiphanyPrediction { | ||
| at_chapter: 3, | ||
| direction_phase: 0.0, | ||
| initial_truth_freq: 0.6, | ||
| initial_truth_conf: 0.5, | ||
| }]; | ||
| let gt = vec![GroundTruthBeat { | ||
| at_chapter: 5, | ||
| confirmed_direction: 0.0, | ||
| matched_predictions: vec![0], | ||
| }]; | ||
| let m = evaluate(&preds, &gt); | ||
| assert_eq!(m.epiphany_precision, 1.0); | ||
| assert_eq!(m.epiphany_recall, 1.0); | ||
| } | ||
|
|
||
| #[test] | ||
| fn harness_metrics_zero_when_no_predictions_match() { | ||
| let preds = vec![EpiphanyPrediction { | ||
| at_chapter: 1, | ||
| direction_phase: 0.0, | ||
| initial_truth_freq: 0.5, | ||
| initial_truth_conf: 0.5, | ||
| }]; | ||
| let gt = vec![GroundTruthBeat { | ||
| at_chapter: 2, | ||
| confirmed_direction: 0.0, | ||
| matched_predictions: vec![], // no matches | ||
| }]; | ||
| let m = evaluate(&preds, &gt); | ||
| assert_eq!(m.epiphany_precision, 0.0); | ||
| assert_eq!(m.epiphany_recall, 0.0); | ||
| } | ||
|
|
||
| #[test] | ||
| fn harness_f1_is_harmonic_mean() { | ||
| // 2 predictions, 1 confirmed -> precision 0.5 | ||
| // 1 ground-truth beat -> recall 1.0 | ||
| // F1 = 2 * 0.5 * 1.0 / (0.5 + 1.0) = 0.6666... | ||
| let preds = vec![ | ||
| EpiphanyPrediction { | ||
| at_chapter: 1, | ||
| direction_phase: 0.0, | ||
| initial_truth_freq: 0.5, | ||
| initial_truth_conf: 0.5, | ||
| }, | ||
| EpiphanyPrediction { | ||
| at_chapter: 2, | ||
| direction_phase: 0.0, | ||
| initial_truth_freq: 0.5, | ||
| initial_truth_conf: 0.5, | ||
| }, | ||
| ]; | ||
| let gt = vec![GroundTruthBeat { | ||
| at_chapter: 4, | ||
| confirmed_direction: 0.0, | ||
| matched_predictions: vec![0], | ||
| }]; | ||
| let m = evaluate(&preds, &gt); | ||
| assert!((m.arc_shift_f1 - 2.0 / 3.0).abs() < 1e-3); | ||
| } | ||
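
In `evaluate`, both precision and recall use the number of confirmed predictions as the numerator. The sketch below, which is not part of the PR, isolates that arithmetic to show one consequence: when several predictions match the same beat, recall can exceed 1.0. `Beat` is a reduced stand-in for `GroundTruthBeat`, and `beat_recall` is a hypothetical alternative that counts beats instead of predictions.

```rust
/// Reduced stand-in for `GroundTruthBeat`: only the field used here.
pub struct Beat {
    pub matched_predictions: Vec<u32>,
}

/// Precision and recall as computed by `evaluate` in the diff:
/// both numerators count confirmed predictions.
pub fn precision_recall(num_predictions: usize, beats: &[Beat]) -> (f32, f32) {
    if num_predictions == 0 {
        return (0.0, 0.0);
    }
    let confirmed = (0..num_predictions as u32)
        .filter(|i| beats.iter().any(|b| b.matched_predictions.contains(i)))
        .count() as f32;
    (
        confirmed / num_predictions as f32,
        confirmed / beats.len().max(1) as f32,
    )
}

/// Assumed alternative (not in the PR): recall counts beats that were hit
/// by at least one valid prediction index, so it stays within [0, 1].
pub fn beat_recall(num_predictions: usize, beats: &[Beat]) -> f32 {
    let hit = beats
        .iter()
        .filter(|b| {
            b.matched_predictions
                .iter()
                .any(|&i| (i as usize) < num_predictions)
        })
        .count() as f32;
    hit / beats.len().max(1) as f32
}
```

For two predictions that both match a single beat, `precision_recall` returns precision 1.0 and recall 2.0, whereas `beat_recall` returns 1.0. Since the file is described as a scaffold, this may simply be something to settle when the ground-truth labels land.
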
`PhaseTag::pi()` currently sets the full tag to `u128::MAX / 2`, but `to_angle()` only decodes the low 64 bits. For `u128::MAX / 2` that low half is all 1s, so `PhaseTag::pi().to_angle()` yields approximately 2π instead of π. Any caller using `pi()` as a canonical opposite-phase constant will get the wrong angle, and phase comparisons will be skewed. This should be encoded via the same low-half scheme as `from_angle` (e.g., half of `u64::MAX` in the low 64 bits).
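
A minimal sketch of the behaviour this comment describes, together with one possible shape for the suggested fix. The method names `pi_as_merged` and `pi_low_half` are hypothetical and exist only to put the two encodings side by side; `to_angle` follows the decoding in the diff.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct PhaseTag(pub u128);

impl PhaseTag {
    /// As merged: the low 127 bits are set, so the low 64 bits are all ones.
    pub fn pi_as_merged() -> Self {
        PhaseTag(u128::MAX / 2)
    }

    /// Sketch of the suggested fix: half of u64::MAX in the low 64 bits,
    /// with the high 64 bits left zero as `from_angle` does.
    pub fn pi_low_half() -> Self {
        PhaseTag((u64::MAX / 2) as u128)
    }

    /// Same decoding as `to_angle` in the diff: only the low 64 bits count.
    pub fn to_angle(self) -> f32 {
        let low = (self.0 & u64::MAX as u128) as u64;
        let normalized = (low as f64) / (u64::MAX as f64);
        (normalized * std::f64::consts::TAU) as f32
    }
}
```

Decoding `pi_as_merged()` gives an angle at 2π, while `pi_low_half()` decodes to π within `f32` precision. Note that the existing `pi_phase_is_half_max` test checks the raw `u128` value, so it would need to change along with the constant.
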