feat(namespace-dir): implement update_table and delete_from_table #6923
Open
XuQianJin-Stars wants to merge 1 commit into
Adds two DML methods that previously fell back to the trait default 'Not supported' implementations:

- update_table: builds an UpdateBuilder by setting each [column, expression] pair from the request, applies the optional predicate via update_where, and returns the actual rows_updated and new dataset version. Validates the updates list (non-empty, two-element pairs, non-empty/duplicate-free column names) up front to surface clean InvalidInput errors.
- delete_from_table: rejects empty/whitespace predicates with InvalidInput, delegates to Dataset::delete, and returns the new dataset version. The trait response model has no deleted_rows field, so we only surface the version (the Lance core DeleteResult.num_deleted_rows is intentionally dropped at the namespace boundary).

Error mapping:
- Predicate / expression parsing failures -> InvalidInput
- Underlying lance_core::Error::InvalidInput from delete -> InvalidInput
- Missing table -> TableNotFound (via load_dataset)
- Other failures -> Internal

Tests (7) added in dir::tests::e2e_table_version_tracking:
- test_update_full_table
- test_update_with_predicate
- test_update_invalid_expression_returns_invalid_input
- test_update_rejects_duplicate_columns
- test_delete_with_predicate
- test_delete_empty_predicate_returns_invalid_input
- test_delete_table_not_found

cargo fmt / clippy / test all green (dir::tests 122 passed, 0 failed).
ACTION NEEDED: The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error, please inspect the "PR Title Check" action.
Closes #6922
Summary
DirectoryNamespace (the filesystem-backed LanceNamespace implementation in rust/lance-namespace-impls/src/dir.rs) currently falls back to the trait's default "Not supported" implementation for two DML operations. This PR wires them up to Lance core, on par with what Dataset::update / Dataset::delete already provide for direct dataset access.

- update_table(UpdateTableRequest) -> UpdateTableResponse, via lance::dataset::write::UpdateBuilder
- delete_from_table(DeleteFromTableRequest) -> DeleteFromTableResponse, via Dataset::delete(predicate)

Both methods are already exposed on the Java side (DirectoryNamespace.java updateTable / deleteFromTable) and by lance-namespace-reqwest-client, so this PR is Rust-only and unblocks the existing client surface.
Changes
update_table
- Builds an UpdateBuilder by setting each [column, expression] pair from the request, applies the optional predicate via update_where, and returns the actual rows_updated and the new dataset version.
- Validates the updates list up-front (non-empty list, exactly two elements per pair, non-empty / duplicate-free column names) so callers get a clean InvalidInput instead of an opaque DataFusion parse error.

delete_from_table
- Rejects empty / whitespace-only predicates with InvalidInput.
- Delegates to Dataset::delete and returns the new dataset version.
- The trait response model has no deleted_rows field, so the DeleteResult.num_deleted_rows returned by Lance core is intentionally dropped at the namespace boundary (documented in code).
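The up-front checks described above could look roughly like the following. This is a minimal, std-only sketch and not the code from this PR: `SketchError`, `validate_updates`, and `validate_predicate` are hypothetical names, and trimming column names and predicates before checking them is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the namespace error type (assumption, not the real type).
#[derive(Debug, PartialEq)]
pub enum SketchError {
    InvalidInput(String),
}

/// Validate `[column, expression]` pairs before any expression is parsed.
/// Returns the pairs as borrowed `(column, expression)` tuples.
pub fn validate_updates(updates: &[Vec<String>]) -> Result<Vec<(&str, &str)>, SketchError> {
    if updates.is_empty() {
        return Err(SketchError::InvalidInput(
            "updates must not be empty".to_string(),
        ));
    }
    let mut seen = HashSet::new();
    let mut pairs = Vec::with_capacity(updates.len());
    for (i, pair) in updates.iter().enumerate() {
        // Each entry must be exactly [column, expression].
        let [column, expr] = pair.as_slice() else {
            return Err(SketchError::InvalidInput(format!(
                "updates[{i}] must have exactly two elements, got {}",
                pair.len()
            )));
        };
        // Assumption: surrounding whitespace in a column name is ignored.
        let column = column.trim();
        if column.is_empty() {
            return Err(SketchError::InvalidInput(format!(
                "updates[{i}] has an empty column name"
            )));
        }
        // `insert` returns false when the column was already seen.
        if !seen.insert(column) {
            return Err(SketchError::InvalidInput(format!(
                "duplicate column in updates: {column}"
            )));
        }
        pairs.push((column, expr.as_str()));
    }
    Ok(pairs)
}

/// Reject empty or whitespace-only delete predicates.
pub fn validate_predicate(predicate: &str) -> Result<&str, SketchError> {
    let trimmed = predicate.trim();
    if trimmed.is_empty() {
        Err(SketchError::InvalidInput(
            "predicate must not be empty".to_string(),
        ))
    } else {
        Ok(trimmed)
    }
}
```

Failing early like this keeps malformed requests from reaching the expression parser, which is what lets callers see a plain InvalidInput rather than a parser-level error.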
Error mapping
- Predicate / expression parsing failure (DataFusionError) -> InvalidInput
- lance_core::Error::InvalidInput from delete -> InvalidInput
- Missing table (via load_dataset) -> TableNotFound
- Other failures -> Internal

Tests

7 new unit tests in dir::tests::e2e_table_version_tracking:
- test_update_full_table: full-table update bumps version by 1
- test_update_with_predicate: only matching rows updated
- test_update_invalid_expression_returns_invalid_input
- test_update_rejects_duplicate_columns
- test_delete_with_predicate: row count decreases, version + 1
- test_delete_empty_predicate_returns_invalid_input
- test_delete_table_not_found
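For readers who want the error mapping in code form, here is one way it could be expressed. It is a sketch under stated assumptions: `CoreFailure` and `NamespaceError` are hypothetical enums that only mirror the four rows of the mapping, and they are not the actual lance_core or namespace error types.

```rust
/// Hypothetical summary of the failure sources named in the error mapping.
#[derive(Debug)]
pub enum CoreFailure {
    /// Predicate or expression could not be parsed.
    ExpressionParse(String),
    /// The underlying delete reported invalid input.
    InvalidInput(String),
    /// The table could not be loaded.
    MissingTable(String),
    /// Anything else.
    Other(String),
}

/// Hypothetical namespace-level error with the three outcomes from the mapping.
#[derive(Debug, PartialEq)]
pub enum NamespaceError {
    InvalidInput(String),
    TableNotFound(String),
    Internal(String),
}

impl From<CoreFailure> for NamespaceError {
    fn from(err: CoreFailure) -> Self {
        match err {
            // Both caller-side mistakes collapse into InvalidInput.
            CoreFailure::ExpressionParse(msg) | CoreFailure::InvalidInput(msg) => {
                NamespaceError::InvalidInput(msg)
            }
            CoreFailure::MissingTable(name) => NamespaceError::TableNotFound(name),
            // Everything else is reported as an internal failure.
            CoreFailure::Other(msg) => NamespaceError::Internal(msg),
        }
    }
}
```

Implementing `From` means a function returning `Result<_, NamespaceError>` can apply `?` to a `Result<_, CoreFailure>` and get the mapping automatically.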