feat(F3): LanceAuditSink with temporal timestamps + full schema round-trip by AdaWorldAPI · Pull Request #302 · AdaWorldAPI/lance-graph

AdaWorldAPI · 2026-04-30T02:41:12Z

Summary

LanceAuditSink implementing AuditSink trait — writes AuditEntry rows to a Lance dataset via Arrow RecordBatch append
Temporal timestamp: DataType::Timestamp(Millisecond, Some("UTC")) — DataFusion temporal predicates work (WHERE ts > now() - 7d)
Full schema: tenant_id (Utf8), actor_id (Utf8), statement_hash (UInt64), timestamp (Timestamp/ms/UTC), action (Utf8), rls_predicates_added (UInt16), rewritten_plan (Utf8 nullable) — no silent column drops
Limit pushdown: scan_back(n) uses scanner.limit(Some(n), Some(skip)) instead of full-scan-then-slice. O(1) not O(N).
Feature-gated behind audit-log; tokio/arrow/lance/futures deps only activate under the flag

Review notes

Initial implementation had Int64 timestamp, missing columns (rls_predicates_added, rewritten_plan silently zeroed), and full-scan footgun. All three fixed.
Lance v4 Scanner::limit(Option<i64>, Option<i64>) verified at source line 1344.

Test plan

test_timestamp_column_is_temporal_type — schema assertion (failing-first proven)
test_round_trip_preserves_rls_predicates_added_and_rewritten_plan — non-default values survive (failing-first proven)
flush_10_then_scan_back_roundtrip — enriched with non-default rls_predicates_added
14 audit tests pass, 77 callcenter tests pass

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj

Generated by Claude Code

…writer (PR-F3) Implements LanceAuditSink behind the `audit-log` feature flag (now pulls lance + arrow + futures + tokio). Entries are buffered via append(), flushed to a Lance dataset in append mode, and read back via scan_back(n). Schema: tenant_id (Utf8), actor_id (Utf8), statement_hash (UInt64), timestamp (Int64), action (Utf8). Tests: flush-10 roundtrip, flush-1000 count, multi-flush accumulation, empty-flush noop. All 75 tests pass (4 new Lance tests + 71 existing). https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj

…lance limit-pushdown

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 928c5d7f8e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-30T02:43:44Z

+        let entries: Vec<AuditEntry> = {
+            let mut guard = self.buffer.lock().unwrap_or_else(|e| e.into_inner());
+            std::mem::take(&mut *guard)
+        };


Preserve buffered entries until flush write succeeds

flush() drains the in-memory buffer with std::mem::take before any I/O, so if Dataset::write(...) later fails (e.g., transient object-store/network error), those audit records are dropped permanently instead of being retried. This creates silent data loss in exactly the failure scenarios an audit sink must tolerate.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-30T02:43:44Z

+        let ds = Dataset::open(&self.dataset_path)
+            .await
+            .map_err(|e| format!("Lance open error: {e}"))?;


Handle missing dataset as empty in scan_back

scan_back() immediately errors if the dataset path has not been created yet, which happens before the first successful flush(). In that common startup path, callers cannot read history and must special-case errors, even though there are simply zero persisted entries; returning an empty vector for not-found would match the method contract better and avoid spurious failures.

Useful? React with 👍 / 👎.

Sprint C agent (PR #311) flagged five staleness items in the vision doc that were out of its §7-only scope. Closing the debt now: Header DRAFT - pending review (2026-04-28) -> Status: F1 parity shipped 2026-04-30. F1 latency benchmark not yet started. F2 is a posture, not a delivery. §2 anchor as of 2026-04-28 -> as of 2026-04-30 (post-F1 parity ship) §2 latency cell Designed to match; F1 numbers (forward tense) -> Designed to match; benchmark pending §2 caveat F1 publishes the first numbers (forward tense) -> F1 parity has shipped (correctness); the separately-scoped F1 latency benchmark has not been started. Distinguishes the two sub-deliverables explicitly. §3 F1 We stand up a Foundry instance... (forward) -> Shipped 2026-04-30. Cross-link to §7's as-shipped architecture. §3 F2 gated upstream by lance-graph PR-1 / PR-2 -> lance-graph PR #278 + #280 + #284 (RLS) and PR #278 + #302 (audit). Status today: lance-graph in production; medcare-rs adopter not yet open. Posture, not delivery. §3 F3 gated upstream by lance-graph PR-4 -> lance-graph PR #278 + #280 (parser + hardening). Status today: parser stub on lance-graph main; medcare-rs adopter is future round-2 work. §4 benchmark harness lands as part of F1 F1 numbers are published (both forward tense) -> F1 parity (correctness) shipped; F1 latency benchmarking has not been started. The two are separately-scoped F1 sub-deliverables. What this PR does NOT touch: - F4, F5, §5 (risks), §6 (NOT promising), §7 (next deliverable just landed in PR #311 - clean already). - The vision doc's tone rule. Every change cites a concrete PR number or file path; no marketing language introduced. - Performance numbers. None claimed; the §4 'do not quote unbenchmarked numbers' rule is preserved verbatim. Diff: +41 / -26 across 1 file. Markdown renders cleanly. Cross-link: PR #311 (the §7 fix that motivated this cleanup).

claude added 2 commits April 29, 2026 20:10

refactor(F3): timestamp temporal type + missing columns round-trip + …

928c5d7

…lance limit-pushdown

chatgpt-codex-connector Bot reviewed Apr 30, 2026

View reviewed changes

AdaWorldAPI merged commit 830c266 into main Apr 30, 2026
1 of 5 checks passed

AdaWorldAPI mentioned this pull request Apr 30, 2026

feat(G4): verb_table tense modulation (Quirk CGEL grounded) #306

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(F3): LanceAuditSink with temporal timestamps + full schema round-trip#302

feat(F3): LanceAuditSink with temporal timestamps + full schema round-trip#302
AdaWorldAPI merged 2 commits into
mainfrom
claude/pr-f3-lance-audit-writer

AdaWorldAPI commented Apr 30, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 30, 2026

Uh oh!

chatgpt-codex-connector Bot Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

AdaWorldAPI commented Apr 30, 2026

Summary

Review notes

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants