HIVE-26882: Allow transactional check of Table parameter before altering the Table#3888
Conversation
| private void alter_table_core(String catName, String dbname, String name, Table newTable, | ||
| EnvironmentContext envContext, String validWriteIdList, List<String> processorCapabilities, String processorId) | ||
| EnvironmentContext envContext, String validWriteIdList, List<String> processorCapabilities, | ||
| String processorId, String expectedPropertyKey, String expectedPropertyValue) |
There was a problem hiding this comment.
Did you consider allowing a list of table properties here?
There was a problem hiding this comment.
Iceberg does not explicitly need this, and it would somewhat complicate the code. So I voted down this. Still I do not feel too strongly either way, so if you think it could be useful, I am open to add multiple checks based on a map (do we have map in thrift?)
There was a problem hiding this comment.
After some more thought, I would opt for keeping the PR as it is - checking only for a single property.
If someone would need more properties to check, then they still could create an uber property - as a hash of the relevant properties -, and use the EXPECTED_PARAMETER_KEY and EXPECTED_PARAMETER_VALUE to check for the changes of this uber property. And if it is that hard for them, they could still add a new feature later.
There was a problem hiding this comment.
Though I still think its cleaner to support multi-properties transactional check, given Table has properties map, I'm leave it to you and @prasanthj
| environmentContext.getProperties().get(hive_metastoreConstants.EXPECTED_PARAMETER_VALUE) : null; | ||
| if (expectedKey != null && expectedValue != null | ||
| && !expectedValue.equals(oldt.getParameters().get(expectedKey))) { | ||
| throw new MetaException("The table has been modified. The parameter value for key '" + expectedKey + "' is '" |
There was a problem hiding this comment.
For my understanding, this only works if a user has table lock while calling alter right?
If user does not have lock, Hive has no internal lock to prevent two users from both getting their expected value and proceeding right?
There was a problem hiding this comment.
Thankfully this is not true.
HMS does these changes using the underlying DB’s transaction.
See:
There was a problem hiding this comment.
It turns out, that you have been right. We use READ_COMMITTED isolation level for our commits.
Because of this, we could commit concurrently, and lose data.
I had to set the isolation level for these transactions to REPEATABLE_READ explicitly.
@szehon-ho and @prasanthj could you please check it if you are back from holidays?
Thanks,
Peter
There was a problem hiding this comment.
Thanks for checking. Is this a setting user can configure in Hive? If so, it sounds like we should document it as a pre-requisite for user to enable this feature
There was a problem hiding this comment.
Users does not have to do anything. We set the isolation level per transaction automatically from the code. Similarly as we do in the TxnHandler methods where the default isolation level is not enough.
There was a problem hiding this comment.
This is the relevant code part:
msdb.openTransaction(Constants.TX_REPEATABLE_READ)
|
Pushed to master. |
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho)
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho)
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho)
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho)
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho)
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho)
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho)
…ing the Table (#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho) (#3946) * HIVE-17981 Create a set of builders for Thrift classes. This closes #274. (Alan Gates, reviewed by Peter Vary) * HIVE-18355: Add builder for metastore Thrift classes missed in the first pass - FunctionBuilder (Peter Vary, reviewed by Alan Gates) * HIVE-18372: Create testing infra to test different HMS instances (Peter Vary, reviewed by Marta Kuczora, Vihang Karajgaonkar and Adam Szita) * HIVE-26882: Allow transactional check of Table parameter before altering the Table (#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho) Co-authored-by: Alan Gates <gates@hortonworks.com> Co-authored-by: Peter Vary <pvary@cloudera.com> Co-authored-by: Peter Vary <peter_vary4@apple.com>
…ing the Table (#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho) (#3947) * HIVE-17981 Create a set of builders for Thrift classes. This closes #274. (Alan Gates, reviewed by Peter Vary) * HIVE-18355: Add builder for metastore Thrift classes missed in the first pass - FunctionBuilder (Peter Vary, reviewed by Alan Gates) * HIVE-18372: Create testing infra to test different HMS instances (Peter Vary, reviewed by Marta Kuczora, Vihang Karajgaonkar and Adam Szita) * HIVE-26882: Allow transactional check of Table parameter before altering the Table (#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho) Co-authored-by: Alan Gates <gates@hortonworks.com> Co-authored-by: Peter Vary <pvary@cloudera.com> Co-authored-by: Peter Vary <peter_vary4@apple.com>
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho)
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho)
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho) (apache#3944) (#5) Co-authored-by: pvary <peter.vary.apache@gmail.com>
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho) (apache#3944) (#5) Co-authored-by: pvary <peter.vary.apache@gmail.com>
…ing the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho) (apache#3944)
HIVE-26882: Allow transactional check of Table parameter before altering the Table (apache#3888) (Peter Vary reviewed by Prasanth Jayachandran and Szehon Ho) (apache#3944)
What changes were proposed in this pull request?
Adding the possibility to transactionally check if a Table parameter is changed before altering the table in the HMS.
Why are the changes needed?
This would provide an alternative, less error-prone and faster way to commit an Iceberg table, as the Iceberg table currently needs to:
After the change these 4 HMS calls could be substituted with a single alter table call.
Also we could avoid cases where the locks are left hanging by failed processes
Does this PR introduce any user-facing change?
HMS API changes:
expectedParameterKey, andexpectedParameterValueto theAlterTableRequestobjecthive_metastoreConstants.EXPECTED_PARAMETER_KEYandhive_metastoreConstants.EXPECTED_PARAMETER_VALUEthrough theEnvironmentContextHow was this patch tested?
Added 2 new unit tests