BigQuery: Eliminate redundant table load by using ETag for conflict detection #14940
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -185,26 +185,25 @@ public void failWhenEtagMismatch() throws Exception { | |
| } | ||
|
|
||
| @Test | ||
| - public void failWhenMetadataLocationDiff() throws Exception { | ||
| + public void failWhenConcurrentModificationDetected() throws Exception { | ||
|
Member
Do you verify that the table is only loaded once?
Contributor
Author
Thank you for the review, Manu. Sorry about that. I have added a verification in this commit to confirm that the table is loaded only once. |
||
| Table tableWithEtag = createTestTable().setEtag("etag"); | ||
| - Table tableWithNewMetadata = | ||
| - new Table() | ||
| - .setEtag("etag") | ||
| - .setExternalCatalogTableOptions( | ||
| - new ExternalCatalogTableOptions() | ||
| - .setParameters(ImmutableMap.of(METADATA_LOCATION_PROP, "a/new/location"))); | ||
|
|
||
| reset(client); | ||
| // Two invocations, for loadTable and commit. | ||
| - when(client.load(TABLE_REFERENCE)).thenReturn(tableWithEtag, tableWithNewMetadata); | ||
| + when(client.load(TABLE_REFERENCE)).thenReturn(tableWithEtag); | ||
|
|
||
| org.apache.iceberg.Table loadedTable = catalog.loadTable(IDENTIFIER); | ||
|
|
||
| - when(client.update(any(), any())).thenReturn(tableWithEtag); | ||
| + // Simulate concurrent modification detected via ETag mismatch | ||
| + when(client.update(any(), any())) | ||
| + .thenThrow(new CommitFailedException("Cannot commit: Etag mismatch")); | ||
|
|
||
| assertThatThrownBy( | ||
| () -> loadedTable.updateSchema().addColumn("n", Types.IntegerType.get()).commit()) | ||
| .isInstanceOf(CommitFailedException.class) | ||
| - .hasMessageContaining("is not same as the current table metadata location"); | ||
| + .hasMessageContaining("Cannot commit"); | ||
|
|
||
| + // Verify table is loaded only once | ||
| + verify(client, times(1)).load(TABLE_REFERENCE); | ||
| } | ||
|
|
||
| @Test | ||
|
|
||
Why is this check removed?
Thank you for your review, Manu.
This check becomes redundant with caching.
Before:
With caching:
The ETag check in tables.patch catches the same conflict, so this check no longer adds value.
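To illustrate why a conditional update can make a separate metadata-location check unnecessary, here is a minimal, self-contained sketch. It is not the code from this PR: `EtagStore`, `Entry`, and `EtagMismatchException` are hypothetical names, and the in-memory store only stands in for a service that rejects an update when the caller supplies a stale ETag.

```java
/** Hypothetical in-memory stand-in for a store that supports ETag-conditional updates. */
final class EtagStore {

  /** Thrown when the caller's ETag no longer matches the stored one. */
  static final class EtagMismatchException extends RuntimeException {
    EtagMismatchException(String message) {
      super(message);
    }
  }

  /** Immutable snapshot of the stored entry. */
  static final class Entry {
    final String etag;
    final String metadataLocation;

    Entry(String etag, String metadataLocation) {
      this.etag = etag;
      this.metadataLocation = metadataLocation;
    }
  }

  private Entry current;
  private int version = 0;
  private int loadCount = 0;

  EtagStore(String initialLocation) {
    this.current = new Entry("v0", initialLocation);
  }

  /** Returns the current entry and counts the call, so redundant loads are visible. */
  synchronized Entry load() {
    loadCount++;
    return current;
  }

  /**
   * Applies the update only if expectedEtag still matches. No reload is needed:
   * a concurrent change always produces a new ETag, so the comparison catches it.
   */
  synchronized Entry update(String expectedEtag, String newLocation) {
    if (!current.etag.equals(expectedEtag)) {
      throw new EtagMismatchException(
          "ETag mismatch: caller has " + expectedEtag + ", store has " + current.etag);
    }
    version++;
    current = new Entry("v" + version, newLocation);
    return current;
  }

  synchronized int loadCount() {
    return loadCount;
  }

  public static void main(String[] args) {
    EtagStore store = new EtagStore("loc/0");
    Entry cached = store.load();          // the single load; the result is cached
    store.update(cached.etag, "loc/1");   // a concurrent writer commits first
    try {
      store.update(cached.etag, "loc/2"); // our commit reuses the stale cached ETag
    } catch (EtagMismatchException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
    System.out.println("loads: " + store.loadCount());
  }
}
```

In this sketch the second writer is rejected even though it never reloads the entry, which mirrors the argument above: once the update itself is conditional on the ETag, comparing metadata locations after a second load detects nothing that the conditional update would miss.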