feat: Foundry parity — RLS rewriter, audit log, PostgREST, with_registry#278
Conversation
Adds the Medcare variant + "medcare" prefix mapping to from_step_type, plus a smoke test verifying the variant is reachable. Also gitignores .claude/worktrees/ which contains ephemeral sub-agent isolation copies. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
…rane-plugins-{rls,audit})
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: da18308553
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Append to scan.filters — DataFusion AND-s these | ||
| // during execution and predicate-pushdown sees them as | ||
| // filterable expressions. | ||
| scan.filters.push(predicate); | ||
| Ok(Transformed::yes(LogicalPlan::TableScan(scan))) |
There was a problem hiding this comment.
Enforce RLS with a residual Filter node
Appending predicates only to TableScan.filters is not sufficient for correctness in DataFusion: those filters are pushdown candidates, and when a provider reports Unsupported or Inexact, DataFusion relies on a separate Filter node to enforce row filtering. This rewrite path adds no residual Filter, so RLS predicates can be partially or entirely skipped depending on table provider capabilities, which can leak cross-tenant rows.
Useful? React with 👍 / 👎.
| #[cfg(all(feature = "membrane-plugins-audit", feature = "audit-log"))] | ||
| audit: Option<Arc<dyn crate::audit::AuditSink>>, |
There was a problem hiding this comment.
Require Debug for audit trait object or drop derive
MembraneRegistry derives Debug while containing Option<Arc<dyn AuditSink>>, but AuditSink does not require Debug. When membrane-plugins-audit and audit-log are enabled together, this field makes the derive fail to compile (dyn AuditSink has no Debug impl), breaking that feature combination.
Useful? React with 👍 / 👎.
Review #1 — Brutally Honest Code ReviewPR landed. Now the honest pass. 2,688 LOC across 9 files, 35 tests, all The one thing that has to change before anyone builds on this🔴 CRITICAL — Fail-open RLS contradicts deny-by-default roadmap
These cannot both be true. The current implementation lets any This is the entire point of LF-3/DM-7. Both downstream consumers (SMB F8, MedCare RBAC) will build on top of this assumption. If we ship the fail-open behaviour, we are shipping the wrong thing. Solution: pub enum RegistryMode { FailOpen, Sealed }
impl RlsPolicyRegistry {
pub fn sealed(self) -> Self { ... } // any unregistered scan → DataFusionError
pub fn fail_open(self) -> Self { ... } // current behaviour, must be explicit
}Default constructor returns High-severity: ship-blockers for production use🟠 HIGH —
|
| # | File | Issue | Fix |
|---|---|---|---|
| 1 | rls.rs:169 |
RlsRewriter has pub fields; mutable after construction |
Privatize, add getters |
| 2 | audit.rs |
DefaultHasher is NOT stable across Rust versions |
Switch to FNV/xxhash, or document this as known-broken across builds |
| 3 | audit.rs |
rls_predicates_added: u8 caps at 255, wraps silently |
u16 |
| 4 | lance_membrane.rs:338 |
intent.role as u8 truncates if ExternalRole grows past 255 variants |
u8::try_from |
| 5 | lance_membrane.rs |
Ordering::Relaxed on current_scent reads |
Acquire/Release pair, or document single-thread invariant |
| 6 | postgrest.rs |
Table name not validated — parse_path("../../../etc/passwd") succeeds |
Reject non-[A-Za-z0-9_] chars |
| 7 | postgrest.rs:191 |
ParseError::new is private but struct field is public — inconsistent |
Either pub new or From<String> |
| 8 | lib.rs:79 |
drain module unconditionally public, others gated |
Add feature gate or document |
| 9 | medcare-foundry-vision.md:130 |
"Confidence-based RLS column masking" — not implemented; aspirational | Mark as F2+ explicitly, not F1 |
Low — cleanup
rls.rsno test for "policy with both columns None" (degenerate path covered in code, untested)postgrest.rsno integration tests forilike,in,is,likeops inparse_path(onlyeq/gte)postgrest.rsJSON escape doesn't handle surrogate pairs above U+FFFF (edge case, document)lib.rs:76commented-outphoenixmodule — remove or link tracking issueCargo.tomldatafusion 52 vs CLAUDE.md says 51 — update CLAUDE.mdorchestration.rsnoDisplayimpl onStepDomain— every other enum has one
What's actually good
- DataFusion
OptimizerRuleintegration is correct.ApplyOrder::TopDown,Transformed::yes/nosemantics, predicate AND-injection onTableScan— all by the book. - Parameterized literals via
lit(...)mean no SQL injection is possible at the rewriter layer. This is the one thing that actually had to be right and it is. - 35 tests pass, including JOIN with two policies, existing-filter preservation, and registry CRUD. Test coverage on the rewriter's actual query-rewrite logic is adequate.
- Audit log poisoned-mutex recovery is correctly implemented (F-09 pattern from the August fix).
- PostgREST hand-rolled JSON emitter handles control chars, quotes, backslashes correctly.
MembraneRegistrybuilder pattern is clean — idempotentwith_rls/with_audit, second call replaces,registry()accessor.- Feature flag layering on
lance-graph-callcenteris correct:audit-log,postgrest,membrane-plugins-rls,membrane-plugins-auditcompose without conflict.
Recommended PR sequence to fix
- PR-278a — Sealed RLS registry (CRITICAL). Add
RegistryMode::Sealeddefault. Update roadmap to match implementation. Add CI check that no production wiring usesfail_open()without explicit annotation. - PR-278b — Hardening (HIGH). Empty-string check on
RlsContext::new. SwapVec→VecDequein audit. Percent-decode in postgrest. Resolvegate_f/free_eduplication. - PR-278c — Polish (MEDIUM). All 9 medium items in one batch.
- PR-278d — Test gaps (LOW). Missing test cases.
Estimated total: ~600 LOC across 4 small PRs over 1 week.
Verdict
The skeleton is right. The DataFusion integration is right. The feature-flag layout is right. The fail-open default and the duplicated gate_f field are the two things that have to change before SMB F8 or MedCare RBAC builds on top of this. Everything else is medium-impact cleanup that can ride on follow-up PRs.
This is solid foundation work with one wrong default. Fix the default; ship it.
Review #2 — Foundry Parity Future Outlook + EpiphaniesThe honest review (above) is the floor. Here's the ceiling — what this PR actually unlocks if we follow it through, and where the genuine architectural wins are hiding. The Foundry Parity Map (where we actually are)
The parity map shows three honest tiers:
A consumer reading the merged PR + roadmap will hit the third tier first because the vision doc dwells on it. Fix the doc to mark aspirational items explicitly so SMB/MedCare don't plan around features that don't exist. Epiphanies of PotentialE1 — RLS-as-Optimizer-Rule lets us unify ALL access policyThe DataFusion
This means PR #278's E2 —
|
| Capability | Status | What's missing | Estimated PRs |
|---|---|---|---|
| Ontology-as-data | Partial (orchestration.rs Pearl mask) | Schema-bound triple types, type-inference query planner | 3-4 |
| Object-typed access control | Not started | Object-graph traversal predicates (not just row predicates) | 2-3 |
| Pipeline lineage | Not started | DAG capture from Lance dataset version + RLS rewriter trace | 2 |
| Code-as-data hooks | Aspirational | UDF registration + sandboxing | 4-5 |
| Operational ontology | Aspirational | Real-time triple update propagation + invalidation | 5-6 |
| Foundry Workshop UI | Out of scope | Cypher cockpit (other PR track) | — |
| Foundry AIP | Aspirational | LLM tail integration with audit-aware RAG | 3-4 |
Roughly 20-25 PRs to genuine Foundry parity at the data/policy layer. The scaffolding in #278 reduces that count from "infinite" to "tractable." That's the win.
Risk register update
- R1 (calibration drift): RLS confidence thresholds inherit from PR feat: grammar/crystal contract + AriGraph episodic unbundling #208's
UNBUNDLE_HARDNESS_THRESHOLD = 0.88. That number was tuned for episodic memory unbundling, NOT access control. It will be wrong for Medcare. Track as F1 calibration task. - R2 (DataFusion version churn): We just bumped 51→52 in PR feat: bump lance 2→4 + datafusion 51→52 + deltalake 0.30→0.31 #273. Foundry parity demands long-term API stability; if DataFusion 53 ships breaking changes to
OptimizerRule, every downstream consumer breaks. Pin or vendor. - R3 (audit log retention costs): O(n) ring buffer at 1024 entries in-memory is the current floor. At 1k QPS for 6 years, retention requires a Lance-backed durable writer that doesn't exist yet. The roadmap mentions it; the code doesn't.
- R4 (PostgREST surface drift): PostgREST itself is a moving target. Pinning to a specific API version + documenting unsupported features (RPC, Realtime) is a chore that nobody on the SMB or MedCare side has signed up for yet.
Next 4 PRs (in priority order)
- PR-278a Sealed RLS registry — un-block downstream consumers from inheriting fail-open default.
- PR-Lance-audit — Lance-backed durable
AuditSinkimpl. Replaces the in-memory ring for production. - PR-RLS-column-mask — extend
RlsRewriterto handle column masking (E1 epiphany). One file, ~200 LOC. - PR-Postgrest-dispatch — wire
PostgRestHandlerto a real DataFusion query executor. Brings the stub to MVP.
These four PRs collectively turn #278 from "interface frozen, not production-ready" to "F1-deployable on a single tenant." Estimated 3 weeks of work.
The bottom line
PR #278 is the right shape with the wrong default and a few duplicate fields. Fix the default, ship the four follow-ups, and we have something Palantir doesn't: a policy layer that's swappable across storage backends and inspectable down to the optimizer-rule level.
The substrate is more important than the mistakes. The mistakes are easy.
Sprint C agent (PR #311) flagged five staleness items in the vision doc that were out of its §7-only scope. Closing the debt now: Header DRAFT - pending review (2026-04-28) -> Status: F1 parity shipped 2026-04-30. F1 latency benchmark not yet started. F2 is a posture, not a delivery. §2 anchor as of 2026-04-28 -> as of 2026-04-30 (post-F1 parity ship) §2 latency cell Designed to match; F1 numbers (forward tense) -> Designed to match; benchmark pending §2 caveat F1 publishes the first numbers (forward tense) -> F1 parity has shipped (correctness); the separately-scoped F1 latency benchmark has not been started. Distinguishes the two sub-deliverables explicitly. §3 F1 We stand up a Foundry instance... (forward) -> Shipped 2026-04-30. Cross-link to §7's as-shipped architecture. §3 F2 gated upstream by lance-graph PR-1 / PR-2 -> lance-graph PR #278 + #280 + #284 (RLS) and PR #278 + #302 (audit). Status today: lance-graph in production; medcare-rs adopter not yet open. Posture, not delivery. §3 F3 gated upstream by lance-graph PR-4 -> lance-graph PR #278 + #280 (parser + hardening). Status today: parser stub on lance-graph main; medcare-rs adopter is future round-2 work. §4 benchmark harness lands as part of F1 F1 numbers are published (both forward tense) -> F1 parity (correctness) shipped; F1 latency benchmarking has not been started. The two are separately-scoped F1 sub-deliverables. What this PR does NOT touch: - F4, F5, §5 (risks), §6 (NOT promising), §7 (next deliverable just landed in PR #311 - clean already). - The vision doc's tone rule. Every change cites a concrete PR number or file path; no marketing language introduced. - Performance numbers. None claimed; the §4 'do not quote unbenchmarked numbers' rule is preserved verbatim. Diff: +41 / -26 across 1 file. Markdown renders cleanly. Cross-link: PR #311 (the §7 fix that motivated this cleanup).
Summary
OptimizerRule— tenant/actor predicate injection on everyTableScan. Unblocks SMB F8 + MedCare RBAC simultaneously.AuditSinktrait +InMemoryAuditSinkring buffer with poison recovery).parse_path()+EchoHandlerdispatcher, 20 tests, no HTTP deps.foundry-roadmap.md+medcare-foundry-vision.mddrafts under.claude/.New Cargo features on
lance-graph-callcenteraudit-log— gatespub mod auditpostgrest— gatespub mod postgrestmembrane-plugins-rls/membrane-plugins-audit— gates builder integrationAll feature combos pass
cargo check; 35 lib tests pass underauth-rls-lite.Diff stats
Test plan
cargo check -p lance-graph-callcenter(default, query-lite, auth-rls-lite, audit-log, postgrest, all-features)cargo test -p lance-graph-callcenter --features auth-rls-lite --lib→ 35 passedhttps://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh