
feat(mem_wal): cache opened L0 flushed-generation datasets #6816

Merged
jackye1995 merged 6 commits into lance-format:main from hamersaw:feature/cache-l0-reads on May 20, 2026

@hamersaw

Contributor

Problem

In the LSM scanner, every query against an L0 (frozen/flushed) generation re-opens that generation's Lance dataset from object storage. There are three identical cold-open sites — scan (planner.rs), point lookup (point_lookup.rs), and vector search (vector_search.rs) — each doing DatasetBuilder::from_uri(path).load() with no session. Per query, per flushed generation, this pays: manifest version discovery + manifest read + decode, file-metadata decode, and scalar/vector index load. For an LSM tree, frozen generations are the single best caching target, yet they were the only data source paying full cold-open cost on every query.

Key invariant

Flush writes each generation once to a globally-unique, content-addressed path (memtable/flush.rs). Same path ⟹ same bytes, forever — a cache hit can never be stale. This is the rare cache that needs no TTL and no invalidation for correctness; pruning is desirable only to reclaim memory.

Changes (OSS lance)

Two complementary, independently-useful, opt-in pieces — defaults preserve existing behavior exactly:

  1. with_session plumbing — thread an existing Arc<Session> into the scanner/planners so the first open of each generation populates and reuses the shared index + file-metadata caches. LsmScanner::new defaults this to the base table's session; without_base_table defaults to None.

  2. FlushedDatasetCache — a moka-backed, single-flight cache of Arc<Dataset> keyed by resolved flushed path, owned and sized by the consumer and injected per-request. After the first open, every subsequent query for that generation is a pure Arc::clone with zero object-store I/O. retain_paths(live_paths) prunes retired generations at compaction (memory-only; correctness never depends on it).
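The single-flight and pruning behavior described in item 2 can be sketched with the standard library alone. This is a synchronous analogy, not the PR's code: the actual cache is moka-backed and presumably async, and `OpenedDataset`, `FlushedCacheSketch`, and the closure-based `get_or_open` signature are hypothetical names introduced for illustration. The sketch also ignores open failures, which a real implementation would have to propagate.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical stand-in for an opened Lance `Dataset`.
#[derive(Debug)]
pub struct OpenedDataset {
    pub path: String,
}

/// Sketch of a single-flight cache keyed by resolved flushed path.
/// Assumption: one `OnceLock` slot per path replaces moka's machinery.
#[derive(Default)]
pub struct FlushedCacheSketch {
    slots: Mutex<HashMap<String, Arc<OnceLock<Arc<OpenedDataset>>>>>,
}

impl FlushedCacheSketch {
    /// Returns the cached dataset for `path`, calling `open` at most once
    /// per path even when many threads ask for it concurrently.
    pub fn get_or_open<F>(&self, path: &str, open: F) -> Arc<OpenedDataset>
    where
        F: FnOnce(&str) -> OpenedDataset,
    {
        // Hold the map lock only long enough to find or create the slot,
        // so a slow open of one path does not block lookups of others.
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            slots.entry(path.to_string()).or_default().clone()
        };
        // `OnceLock::get_or_init` runs the initializer at most once;
        // concurrent callers for the same slot wait for that one result.
        Arc::clone(slot.get_or_init(|| Arc::new(open(path))))
    }

    /// Drops every entry whose path is not in `live` (memory reclamation only).
    pub fn retain_paths(&self, live: &HashSet<String>) {
        self.slots.lock().unwrap().retain(|path, _| live.contains(path));
    }

    /// Number of paths currently tracked.
    pub fn len(&self) -> usize {
        self.slots.lock().unwrap().len()
    }
}
```

Because a flushed path is written once and never rewritten, a hit needs no freshness check: the slot either holds the dataset or is still being initialized. Calling `retain_paths` with the set of live paths after compaction only bounds memory; skipping it would not change query results.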

A single shared open_flushed_dataset(path, session, cache) helper replaces all three cold-open sites (repo rule: dedupe logic in 2+ places). None/None reproduces the original behavior precisely, so no existing test changes.
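A rough sketch of how one helper can serve all three call sites while keeping the `None`/`None` case equivalent to a cold open. Everything here is an assumption for illustration: `SessionStub`, `DatasetStub`, `CacheStub`, `cold_open`, and `open_flushed` are hypothetical stand-ins, and the real `open_flushed_dataset` is presumably async and fallible. For brevity this version holds the cache lock while opening, which a production cache would avoid.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a Lance `Session`.
pub struct SessionStub;

/// Hypothetical stand-in for an opened dataset; records whether a
/// session was supplied at open time.
pub struct DatasetStub {
    pub path: String,
    pub used_session: bool,
}

/// Hypothetical stand-in for the injected dataset cache.
#[derive(Default)]
pub struct CacheStub {
    entries: Mutex<HashMap<String, Arc<DatasetStub>>>,
}

/// Simulates the original per-query open from object storage.
fn cold_open(path: &str, session: Option<&Arc<SessionStub>>) -> Arc<DatasetStub> {
    Arc::new(DatasetStub {
        path: path.to_string(),
        used_session: session.is_some(),
    })
}

/// Single entry point for scan, point lookup, and vector search.
/// With `session = None` and `cache = None` every call is a fresh cold open.
pub fn open_flushed(
    path: &str,
    session: Option<&Arc<SessionStub>>,
    cache: Option<&CacheStub>,
) -> Arc<DatasetStub> {
    match cache {
        // No cache injected: behave exactly like the pre-change code path.
        None => cold_open(path, session),
        // Cache injected: open on the first request, clone the Arc afterwards.
        Some(cache) => {
            let mut entries = cache.entries.lock().unwrap();
            entries
                .entry(path.to_string())
                .or_insert_with(|| cold_open(path, session))
                .clone()
        }
    }
}
```

Routing every site through one function means the opt-in behavior is decided in a single place, and callers that pass nothing see no change.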

data_source.rs / collector.rs are untouched — opening stays lazy inside the planner, preserving bloom-filter pruning on point lookups. Planner wiring uses chainable with_session/with_flushed_cache builder methods rather than constructor changes, keeping new() signatures (and every existing test/bench) untouched.
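The builder-style wiring can be illustrated as follows. This is a minimal sketch under stated assumptions: `PlannerSketch`, `SessionStub`, `CacheStub`, and `is_cold_open` are hypothetical names, and only the `with_session` / `with_flushed_cache` method names come from the description above.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a Lance `Session`.
pub struct SessionStub;

/// Hypothetical stand-in for the injected dataset cache.
pub struct CacheStub;

/// Sketch of a planner whose constructor signature stays unchanged.
pub struct PlannerSketch {
    flushed_paths: Vec<String>,
    session: Option<Arc<SessionStub>>,
    flushed_cache: Option<Arc<CacheStub>>,
}

impl PlannerSketch {
    /// Same arguments as before the change; both options default to `None`.
    pub fn new(flushed_paths: Vec<String>) -> Self {
        Self {
            flushed_paths,
            session: None,
            flushed_cache: None,
        }
    }

    /// Opt in to sharing index and file-metadata caches through a session.
    pub fn with_session(mut self, session: Arc<SessionStub>) -> Self {
        self.session = Some(session);
        self
    }

    /// Opt in to reusing already-opened datasets.
    pub fn with_flushed_cache(mut self, cache: Arc<CacheStub>) -> Self {
        self.flushed_cache = Some(cache);
        self
    }

    /// True when neither option is set, i.e. the original cold-open behavior.
    pub fn is_cold_open(&self) -> bool {
        self.session.is_none() && self.flushed_cache.is_none()
    }

    /// Paths of the flushed generations this planner would read.
    pub fn flushed_paths(&self) -> &[String] {
        &self.flushed_paths
    }
}
```

Existing callers that write `PlannerSketch::new(paths)` compile and behave as before; new callers chain `.with_session(..)` and `.with_flushed_cache(..)` as needed.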

Testing

  • New unit tests for FlushedDatasetCache: miss opens once; hit returns the same Arc (pointer eq); 16-way concurrent get_or_open opens exactly once (single-flight); retain_paths drops the right keys; no-cache path cold-opens each call.
  • Regression: full mem_wal::scanner suite (78 tests) passes untouched.
  • cargo clippy -p lance --tests --benches clean; cargo fmt clean.

Notes

The sophon consumer side (process-bootstrap cache ownership, scanner wiring, compaction retain_paths) is out of scope for this PR. Phase 1 (with_session) is independently shippable ahead of the cache.

🤖 Generated with Claude Code

In the LSM scanner, every query against an L0 flushed generation
re-opened that generation's Lance dataset from object storage at three
identical sites (scan, point lookup, vector search), paying manifest
read + metadata decode + index load each time.

Add two opt-in, non-breaking pieces:

- `with_session` plumbing on the scanner/planners so the first open of
  each generation populates and reuses the shared index/metadata
  caches (defaults to the base table's session).
- `FlushedDatasetCache`: a moka-backed, single-flight cache of
  `Arc<Dataset>` keyed by resolved flushed path, injected by the
  consumer. After the first open, subsequent queries are a pure
  `Arc::clone` with zero object-store I/O.

Flushed generations are written once to a globally-unique immutable
path, so cached entries are never stale and need no TTL; `retain_paths`
pruning at compaction is memory-only and correctness never depends on
it. A single shared `open_flushed_dataset` helper covers all three
sites; `None`/`None` reproduces the original cold-open exactly, so all
existing tests pass untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@claude claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions github-actions Bot added the enhancement (New feature or request) label May 17, 2026

The cache-l0-reads change added `moka` to rust/lance and updated the
root Cargo.lock, but python/ is a separate cargo workspace with its
own lock. CI's "Lint Rust" step runs `cargo clippy --locked` from
python/ and failed at lock resolution before clippy could run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@github-actions github-actions Bot added the A-python (Python bindings) label May 17, 2026

/// The key is the resolved absolute flushed path
/// (`{base}/_mem_wal/{shard}/{folder}`), which is globally unique, so a single
/// cache can safely span multiple tables.
pub struct FlushedDatasetCache {

@jackye1995 jackye1995 May 18, 2026

Contributor


Thanks for working on this. I think the Session plumbing is absolutely right, but I’m less convinced we should add FlushedDatasetCache to the Lance SDK.

My mental model is that each flushed memtable is already naturally a standalone Lance dataset. When that dataset is opened with the same Lance Session, the SDK-level caches should already cover the Lance-internal reusable state: object store registry/store reuse, file metadata cache, index cache, and index extensions. That part feels like the right SDK responsibility.

Caching the opened Dataset object itself feels like an application-level concern. The right owner of that cache is the calling service/application that knows the lifecycle of the L0 generations, compaction timing, memory budget, tenant/table boundaries, and whether a cache should be per-process, per-table, per-session, or scoped in some other way. I’d prefer not to make Lance SDK own that policy.

What do you think?

@jackye1995 jackye1995 merged commit e808eb1 into lance-format:main May 20, 2026
27 of 28 checks passed
@hamersaw hamersaw deleted the feature/cache-l0-reads branch May 20, 2026 20:34

Labels

A-python (Python bindings), enhancement (New feature or request)
