feat: sql catalog support update table #862
liurenjie1024 left a comment
Thanks @Li0k for this PR, but I have concerns about introducing update table at this moment, as many features are still missing, such as conflict detection and commit retry.
/// Returns snapshot references.
#[inline]
pub fn snapshot_refs(&self) -> &HashMap<String, SnapshotReference> {
Why do we add this method? We already have a lookup method for snapshots.
  /// TableCommit represents the commit of a table in the catalog.
  #[derive(Debug, TypedBuilder)]
- #[builder(build_method(vis = "pub(crate)"))]
+ #[builder(build_method(vis = "pub"))]
The reason we make TableCommit crate-only is that we don't want to allow users to build it manually; all table commit construction should go through the transaction API.
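The point about keeping construction behind the transaction API can be sketched in a few lines. This is only an illustration, not the iceberg-rust code: the `catalog` module, the simplified `TableCommit` and `Transaction` types, and the `set_property` method are all hypothetical, and module privacy stands in here for the crate-level `pub(crate)` visibility used in the real builder.

```rust
mod catalog {
    /// Hypothetical, simplified stand-in for a commit handed to the catalog.
    #[derive(Debug)]
    pub struct TableCommit {
        table: String,
        updates: Vec<String>,
    }

    impl TableCommit {
        // Private constructor: only code inside `catalog` can build a commit.
        // (Assumption: this mirrors the intent of a `pub(crate)` build method.)
        fn new(table: String, updates: Vec<String>) -> Self {
            Self { table, updates }
        }

        /// Read-only accessors remain public.
        pub fn table(&self) -> &str {
            &self.table
        }

        pub fn updates(&self) -> &[String] {
            &self.updates
        }
    }

    /// Hypothetical transaction type: the only public way to obtain a commit.
    pub struct Transaction {
        table: String,
        updates: Vec<String>,
    }

    impl Transaction {
        pub fn new(table: &str) -> Self {
            Self {
                table: table.to_string(),
                updates: Vec::new(),
            }
        }

        /// Records one pending update and returns the transaction for chaining.
        pub fn set_property(mut self, key: &str, value: &str) -> Self {
            self.updates.push(format!("set {}={}", key, value));
            self
        }

        /// Consumes the transaction and produces the commit.
        pub fn commit(self) -> TableCommit {
            TableCommit::new(self.table, self.updates)
        }
    }
}
```

With this layout, a caller outside `catalog` can write `catalog::Transaction::new("db.t").set_property("owner", "alice").commit()`, while a direct call to `catalog::TableCommit::new(..)` is rejected by the compiler because the constructor is private.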
…/catalog_sql_update_table: 8dbbf3b to 8d0f168
update_table_metadata_builder = table_update.apply(update_table_metadata_builder)?;
}

for table_requirement in requirements {
Shouldn't the requirements be checked in a transaction (that executes the update statement)? Otherwise a conflicting concurrent commit can update first and we end up in a broken table state.
The table metadata that's used to validate the requirements would also need to be loaded within the transaction.
I believe it would also make sense to explicitly set a transaction isolation level of repeatable read. Postgres, for example, defaults to read committed, which can similarly get us into a broken table state:
Read committed allows us to see different versions of the same row between the SELECT statement (that we use to validate the commit requirements) and the UPDATE statement. Effectively, a concurrently running conflicting update operation that commits between SELECT and UPDATE will still allow our UPDATE to succeed. We never re-check the requirements against the new table state, only against the old one, so we end up in a broken state.
With repeatable read, on the other hand, the UPDATE should safely fail with a serialization error.
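The underlying idea, validating the requirement and applying the update as one atomic step, can be illustrated without a database. The following is a minimal sketch under stated assumptions: `Catalog`, `TableRow`, `CommitError`, and the single "expected metadata location" requirement are hypothetical names invented for the example, and a `Mutex` plays the role that the SQL transaction (with a suitable isolation level) would play in the real catalog.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for one row of the catalog's table.
#[derive(Debug, Clone, PartialEq)]
struct TableRow {
    metadata_location: String,
}

#[derive(Debug, PartialEq)]
enum CommitError {
    TableNotFound,
    /// The requirement no longer holds: another commit landed first.
    Conflict { current: String },
}

/// Hypothetical in-memory catalog. The mutex models the transaction that
/// makes "check requirements, then update" a single atomic step.
struct Catalog {
    rows: Mutex<HashMap<String, TableRow>>,
}

impl Catalog {
    fn new() -> Self {
        Self {
            rows: Mutex::new(HashMap::new()),
        }
    }

    fn create(&self, name: &str, location: &str) {
        let row = TableRow {
            metadata_location: location.to_string(),
        };
        self.rows.lock().unwrap().insert(name.to_string(), row);
    }

    fn current(&self, name: &str) -> Option<String> {
        let rows = self.rows.lock().unwrap();
        rows.get(name).map(|r| r.metadata_location.clone())
    }

    /// Loads the row, validates the requirement, and writes the new location
    /// while holding the lock, so no other commit can slip in between.
    fn commit(&self, name: &str, expected: &str, new: &str) -> Result<(), CommitError> {
        let mut rows = self.rows.lock().unwrap();
        let row = rows.get_mut(name).ok_or(CommitError::TableNotFound)?;
        if row.metadata_location != expected {
            return Err(CommitError::Conflict {
                current: row.metadata_location.clone(),
            });
        }
        row.metadata_location = new.to_string();
        Ok(())
    }
}
```

If two writers both start from `v1.json`, the first `commit` succeeds and the second receives `CommitError::Conflict` instead of silently overwriting the first writer's metadata. Checking the requirement outside the lock (or, in SQL, outside the transaction) would reopen exactly the window described above.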
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.
This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.
This PR supports the update_table interface for the SQL catalog.
Other PRs for reference:
After these PRs have been merged, we can use a SQL database as the catalog backend.