Hive: Refactor commit lock mechanism from HiveTableOperations#6648
Conversation
| public interface HiveLock { | ||
| void lock() throws LockException; | ||
|
|
||
| void ensureActive() throws LockException; | ||
|
|
||
| void unlock(); | ||
| } |
There was a problem hiding this comment.
There's already a LockManager interface with acquire/release and a BaseLockManager which has heartbeating mechanisms would it make sense to leverage the existing abstractions?
There was a problem hiding this comment.
In #6570 I have described my reasoning behind not using the existing LockManager interface:
We already have LockManager interface defined in the iceberg-api module. After some back-and-forth I decided against using it, because of the following reasons:
- I do not think anyone would like to use HiveLockManager without HiveCatalog
- The interface is not that useful for us:
- We would need to keep track of the HMS lockId internally
- We would need to update the LockManager if the
setConfmethod is called on the HiveCatalog - We would need to add something like
ensureActiveto the interface which is needed for HiveTableOperations
- The BaseLockManager does not provide too much of a functionality
- The current configuration keys are different from the ones used by the LockManager implementations
@szehon-ho and @amogh-jahagirdar, I am happy to change the PR if we can find a good solution. Could you please help me what should be the main approach?
- Shall we keep the
LockManageras it is and create/useHiveLockManagerwith extra methods, or addensureAcitvemethod to theLockMangerinterface? - Shall we keep the current config for the hive locks or shall we use the new ones instead, or keep both configs and create a deprecation handling for it?
- Shall we use the
schedulerfor defined with theBaseLockManagerand accept the possibility that if the pool is not big enough for the JVM then there will be no heartbeat for the new commits?
Thanks,
Peter
There was a problem hiding this comment.
@amogh-jahagirdar, @szehon-ho: What do you think? Are any of the options above attractive enough to pursue? Or any other ideas?
There was a problem hiding this comment.
@pvary sorry for the late reply so here are my thoughts:
1.) Shall we keep the LockManager as it is and create/use HiveLockManager with extra methods, or add ensureAcitve method to the LockManger interface?
So after reading the code it seems like LockManager was designed in a manner where the heartbeating is assumed to be happening under the hood for the implementations which require heartbeats. If I'm not mistaken ensureActive is used at certain points to check the heartbeat status which is specific for HiveLock.
What we could do here is have the separate HiveLockManager interface but it still extends BaseLockManager so we inherit acquire/release, and then the ensureActive is specific to HiveLockManager. We shouldn't have to have a separate lock/unlock imo because acquire/release are already on the LockManager itself, and those are the same operations.
let me know if that makes sense!
2.) Shall we keep the current config for the hive locks or shall we use the new ones instead, or keep both configs and create a deprecation handling for it?
I'm not super opinionated on this tbh, the current ones make sense to me. If we do decide to add new ones, having a deprecation path then should happen, rather than changing it all at once.
3.) Shall we keep the LockManager as it is and create/use HiveLockManager with extra methods, or add ensureAcitve method to the LockManger interface?
The current LockManager implementation silently accepts the job, and does not run the heartbeat until there is an empty thread. HiveTableOperations has its own thread and handles this correctly.
This seems like incorrect behavior in the BaseLockManager then, I think we should always guarantee that if the heartbeat thread cannot start we fail. So I think ideally if possible we fix it in the BaseLockmanager itself. In my mind we also add a method withHeartbeatTask(Runnable heartbeatTask).
I think @jackye1995 did the original implementation for BaseLockManager, I'll wait for him to comment though if the silent acceptance of the heartbeat thread is intentional.
There was a problem hiding this comment.
I think @jackye1995 did the original implementation for BaseLockManager, I'll wait for him to comment though if the silent acceptance of the heartbeat thread is intentional.
Yeah this is a good point, seems like a miss of the original implementation, we should fix it if it's a known issue
There was a problem hiding this comment.
I did some drafting myself to see how hard would this be, and here are my thoughts:
- There is a big difference between the
LockManagerwhich aims to be the single shared instance which is called by multipleTableOperationsinstances (created by the sameCatalog), and theMetastoreLockwhich is handling a single instance of the lock. With the current code it is possible to have a different locking configuration for different tables created through the same HiveCatalog implementation (with the use ofHiveCatalog.setConf), and we aim to keep that, so some tables will use the old locking and some tables will use the new locking mechanism (See: Hive: Use EnvironmentContext instead of Hive Locks to provide transactional commits after HIVE-26882 #6570).
For the concrete methods:
ensureActive- this is currently Hive specific, but if you need to have a heartbeat then it means that you run the risk that one way or another the heartbeat fails (temporary connection issue, or whatever). I think this means that failing heartbeat is not Hive specific, just an issue not addressed by other LockManager instances (maybe not that pronounced there but still a valid issue)LockManageracquireHiveLock needsdatabaseandtableto acquire a lock - whereas LockManager has onlyentityId- we can get over with it by splitting theentityIdon., but this is not that nicerelease- we can use thisinitialize- we can get away with pushing the needed stuff through the constructor - but remember that we have to create the LockManager for every operation to ensure that the configurations are honored.
BaseLockManageronly implements the following methods:scheduler- needs fixinginitialize- only reads the cofigurations which are not needed anyway for Hive as we plan to keep the old config
So as I see now, the Hive lock handling would not benefit too much from the existing LockManager infra.
All-in-all this would be another big change above this big refactor.
If we want to do this we should split it out to multiple steps to help the reviewers.
Also, I am overwhelmed with other work, so I will not have too much time working on this.
@amogh-jahagirdar: I propose to merge this change for now, and if you have time I would be happy to review your PR from the Hive perspective which merges the 2 Locking behaviour.
Would that work for you @amogh-jahagirdar?
Thanks,
Peter
There was a problem hiding this comment.
@pvary Thanks for the explanation, yeah it should be fine to merge! In this case it does seem complex to reconcile the two. Didn't intend to block there, just wanted to see if we were making the change in the right manner! Just one thing before though, can we double check the classes which are public need to be public? It makes future refactoring easier since we can avoid deprecations.
| // Starting heartbeat for the HMS lock | ||
| hiveLockHeartbeat = | ||
| new HiveLockHeartbeat(metaClients, hmsLockId.get(), lockHeartbeatIntervalTime); | ||
| hiveLockHeartbeat.schedule(exitingScheduledExecutorService); |
There was a problem hiding this comment.
Here I wonder again if we can leverage the existing LockManager (maybe extend it to allow passing in a custom heartbeat runnable)
There was a problem hiding this comment.
If we create a pool with 5 threads and we have 6 concurrent commits, then 1 thread will not have heartbeat.
Shall we fail the commit in this case? The current LockManager implementation silently accepts the job, and does not run the heartbeat until there is an empty thread. HiveTableOperations has its own thread and handles this correctly.
| // getting a process-level lock per table to avoid concurrent commit attempts to the same table | ||
| // from the same | ||
| // JVM process, which would result in unnecessary and costly HMS lock acquisition requests | ||
| acquireJvmLock(); |
There was a problem hiding this comment.
I think maintaining a in process lock and acquiring it before using the distributed locking mechanism is something we probably want in all cases (it makes a ton of sense to avoid expensive calls). if we use a lock manager abstraction I think we can move it to the parent and then after that delegate to acquireLock which acquires the distributed lock. We're actually missing this in the DynamoDB lock manager which I think is a perfect application of this
There was a problem hiding this comment.
I think the most expensive part after this is acquiring the Lock itself. Is this what is expensive for the DynamoDB lock manager too?
There was a problem hiding this comment.
Right, it's also the same. For DynamoDB lock manager case, it's doing the conditional put on the dynamodb table.
szehon-ho
left a comment
There was a problem hiding this comment.
Thanks for the refactor and work @pvary . Also agree with @amogh-jahagirdar , if some of this logic can be refactored even to be shared across Iceberg, that would be great. Not sure the difficulty there.
| public void lock() throws LockException { | ||
| // getting a process-level lock per table to avoid concurrent commit attempts to the same table | ||
| // from the same | ||
| // JVM process, which would result in unnecessary and costly HMS lock acquisition requests |
There was a problem hiding this comment.
Nit: can we fix this comment now to be more balanced (maybe remove either 'unneccessary' or 'costly' as they are redundant)? I guess its written before the 80 char line spotless rules
There was a problem hiding this comment.
Done, please double check the new wording
| lockComponent.setTablename(tableName); | ||
| final LockRequest lockRequest = | ||
| new LockRequest( | ||
| Lists.newArrayList(lockComponent), System.getProperty("user.name"), hostName); |
There was a problem hiding this comment.
Agree with @haizhou-zhao to use UserGroupInformation if possible
There was a problem hiding this comment.
Wasn't sure if we should keep this as a refactor only, or we can add changes.
Added the change, and used HiveHadoopUtil.currentUser().
@haizhou-zhao - is this as it should be?
| return null; | ||
| } | ||
|
|
||
| private void unlock(Optional<Long> lockId) { |
There was a problem hiding this comment.
Style: I see an Intellij warning that Optional is used as an argument. Optional javadoc states it is only used for return types, is there any way to simplify it? Maybe lockId can also just be null/non-null? Reference again: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Optional.html (API Note)
There was a problem hiding this comment.
I am heavily building on that we either have a lockId or not in the method. I can use a parameter which could be null and check that, and I can convert the Optional from the caller, but it would be awkward. Also since it is a private method, I would keep it his way. WTYT?
There was a problem hiding this comment.
Yea I'm ok, I guess the alternative is to have just lockId outside be a possible null object. Just wondering, do you see the intellij warnings, or is it just me? Its just a bit annoying to see those in a file when trying to add/modify unrelated code, hence reason for comment.
There was a problem hiding this comment.
I see the warnings - actually a few of them:
- 'Optional' used as type for field 'hmsLockId'
- Result of 'Thread.interrupted()' is ignored
- 'Optional' used as type for parameter 'lockId'
- Result of 'Thread.interrupted()' is ignored
- Method invocation 'lock' may produce 'NullPointerException'
There was one more which was absolutely correct (null check was missing in ensureActive) and I fixed that, but I think the others are fine/intentional.
We can use annotations to turn off them, like @SuppressWarnings("OptionalUsedAsFieldOrParameterType"). Does it worth to do it?
amogh-jahagirdar
left a comment
There was a problem hiding this comment.
Left my thoughts on how we can reuse LockManager, pass in a custom heartbeat thread, and a fix in the base lock manager. Let me know if that makes sense @pvary!
| public interface HiveLock { | ||
| void lock() throws LockException; | ||
|
|
||
| void ensureActive() throws LockException; | ||
|
|
||
| void unlock(); | ||
| } |
There was a problem hiding this comment.
@pvary sorry for the late reply so here are my thoughts:
1.) Shall we keep the LockManager as it is and create/use HiveLockManager with extra methods, or add ensureAcitve method to the LockManger interface?
So after reading the code it seems like LockManager was designed in a manner where the heartbeating is assumed to be happening under the hood for the implementations which require heartbeats. If I'm not mistaken ensureActive is used at certain points to check the heartbeat status which is specific for HiveLock.
What we could do here is have the separate HiveLockManager interface but it still extends BaseLockManager so we inherit acquire/release, and then the ensureActive is specific to HiveLockManager. We shouldn't have to have a separate lock/unlock imo because acquire/release are already on the LockManager itself, and those are the same operations.
let me know if that makes sense!
2.) Shall we keep the current config for the hive locks or shall we use the new ones instead, or keep both configs and create a deprecation handling for it?
I'm not super opinionated on this tbh, the current ones make sense to me. If we do decide to add new ones, having a deprecation path then should happen, rather than changing it all at once.
3.) Shall we keep the LockManager as it is and create/use HiveLockManager with extra methods, or add ensureAcitve method to the LockManger interface?
The current LockManager implementation silently accepts the job, and does not run the heartbeat until there is an empty thread. HiveTableOperations has its own thread and handles this correctly.
This seems like incorrect behavior in the BaseLockManager then, I think we should always guarantee that if the heartbeat thread cannot start we fail. So I think ideally if possible we fix it in the BaseLockmanager itself. In my mind we also add a method withHeartbeatTask(Runnable heartbeatTask).
I think @jackye1995 did the original implementation for BaseLockManager, I'll wait for him to comment though if the silent acceptance of the heartbeat thread is intentional.
szehon-ho
left a comment
There was a problem hiding this comment.
Looks good in terms of correctness, thanks.
Added a handful of style suggestions I think will make it clearer for any person reading the code.
| } | ||
|
|
||
| if (!lockInfo.lockState.equals(LockState.ACQUIRED)) { | ||
| // timeout and do not have lock acquired |
There was a problem hiding this comment.
Nit: up to you, but feel we can remove these inline comments now that code is refactored to be relatively self-documenting?
There was a problem hiding this comment.
Removed the first 2 comments, renamed the error, but kept the last comment where it is not straightforward why it is there
|
Thanks for the detailed explanations @pvary I agree it does seem difficult to reconcile the two abstractions at this point. The only thing on my side is can we confirm if all the public classes/interfaces really need to be public? |
|
Merged to master. |
…#6648) Co-authored-by: Adam Szita <40628386+szlta@users.noreply.github.com> Co-authored-by: Peter Vary <peter_vary4@apple.com>
|
|
||
| commitStatus = CommitStatus.SUCCESS; | ||
| } catch (LockException le) { | ||
| throw new CommitStateUnknownException( |
There was a problem hiding this comment.
Hi @pvary, I think here we should set the commitStatus to UNKNOWN before throwing the CommitStateUnknownException. Otherwise the commitStatus is still FAILURE which could lead to the metadata being deleted in cleanupMetadataAndUnlock.
There was a problem hiding this comment.
Yea, I think thats a good catch
There was a problem hiding this comment.
Sounds right to me.
@ConeyLiu: would you mind creating a PR for fixing this?
…#6648) Co-authored-by: Adam Szita <40628386+szlta@users.noreply.github.com> Co-authored-by: Peter Vary <peter_vary4@apple.com>
Reviewal/rebase of #5877