KAFKA-16977: Reapply dynamic remote configs after broker restart #16353
satishd merged 1 commit into apache:trunk
The below remote log configs can be configured dynamically:
1. remote.log.manager.copy.max.bytes.per.second
2. remote.log.manager.fetch.max.bytes.per.second
3. remote.log.index.file.cache.total.size.bytes
If those values are dynamically configured, then during a broker restart the broker loads the static values from the config file instead of the dynamic values.
indexCache = new RemoteIndexCache(rlmConfig.remoteLogIndexFileCacheTotalSizeBytes(), remoteLogStorageManager, logDir);
RemoteLogManagerConfig rlmConfig = config.remoteLogManagerConfig();
indexCache = new RemoteIndexCache(config.remoteLogIndexFileCacheTotalSizeBytes(), remoteLogStorageManager, logDir);
Not sure whether I have caught the context, so please feel free to correct me.
- IIRC, the dynamic configs are loaded by another thread, and hence we may NOT see the latest configs, which were updated dynamically, when creating RemoteLogManager, right?
- Those configs (remote.log.manager.copy.max.bytes.per.second, remote.log.manager.fetch.max.bytes.per.second) can be updated by the reconfigure process, so it should be fine to initialize them with "stale" (static) configs after broker restart, right?
- IIRC, the dynamic configs are loaded by another thread, and hence we may NOT see the latest configs, which were updated dynamically, when creating RemoteLogManager, right?
No, KafkaServer/BrokerServer calls config.dynamicConfig.initialize before creating the RemoteLogManager instance, so the dynamic configs get updated in the KafkaConfig object but not in KafkaConfig.remoteLogManagerConfig().
I have tested the patch only with ZooKeeper. I think the behavior should be similar for KRaftMetadataCache/ConfigRepository.
- those configs (remote.log.manager.copy.max.bytes.per.second, remote.log.manager.fetch.max.bytes.per.second) can be updated by reconfigure process, so it should be fine to initialize them with "stale" (static) configs after broker restart, right?
This is correct. The reconfigure process updates the dynamic value, but we are referring to the static value, as explained above.
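The behavior described in this thread can be reproduced with a toy model. The sketch below is not Kafka code: `ToyBrokerConfig` and `ToyRemoteConfig` are hypothetical stand-ins for `KafkaConfig` and `RemoteLogManagerConfig`, and the numeric values are made up. It only illustrates how a sub-config built once from the static properties keeps returning the static value after dynamic overrides are applied, while an accessor on the outer config sees the override.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for RemoteLogManagerConfig: copies the value once, at construction.
final class ToyRemoteConfig {
    private final long copyMaxBytesPerSecond;

    ToyRemoteConfig(Map<String, Long> props) {
        this.copyMaxBytesPerSecond = props.get(ToyBrokerConfig.COPY_RATE);
    }

    long copyMaxBytesPerSecond() {
        return copyMaxBytesPerSecond;
    }
}

// Hypothetical stand-in for KafkaConfig.
final class ToyBrokerConfig {
    static final String COPY_RATE = "remote.log.manager.copy.max.bytes.per.second";

    private volatile Map<String, Long> current;
    private final ToyRemoteConfig remoteConfig;

    ToyBrokerConfig(Map<String, Long> staticProps) {
        this.current = new HashMap<>(staticProps);
        // The sub-config is created from the static properties and never rebuilt.
        this.remoteConfig = new ToyRemoteConfig(staticProps);
    }

    // Models applying dynamic overrides during startup (assumed behavior, simplified).
    void applyDynamicOverrides(Map<String, Long> overrides) {
        Map<String, Long> merged = new HashMap<>(current);
        merged.putAll(overrides);
        current = merged;
    }

    // Returns the cached sub-config: still holds the static value.
    ToyRemoteConfig remoteLogManagerConfig() {
        return remoteConfig;
    }

    // Reads through the current values: reflects the dynamic override.
    long copyMaxBytesPerSecond() {
        return current.get(COPY_RATE);
    }
}

final class StaleSnapshotDemo {
    public static void main(String[] args) {
        ToyBrokerConfig config = new ToyBrokerConfig(Map.of(ToyBrokerConfig.COPY_RATE, 1000L));
        config.applyDynamicOverrides(Map.of(ToyBrokerConfig.COPY_RATE, 5000L));
        System.out.println(config.remoteLogManagerConfig().copyMaxBytesPerSecond()); // prints 1000
        System.out.println(config.copyMaxBytesPerSecond());                          // prints 5000
    }
}
```

In this model, a component created after `applyDynamicOverrides` still starts with the static value if it reads through `remoteLogManagerConfig()`, which mirrors the symptom discussed here.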
I have tested the patch only with ZooKeeper. I think the behavior should be similar for KRaftMetadataCache/ConfigRepository.
That is a good point, and maybe they do differ in some way.
In KRaft, zkClientOpt is None, so the config is not updated with the dynamic parts.
Yes, I confirmed in KRaft, it won't have this issue.
Yes, I confirmed in KRaft, it won't have this issue.
Sorry, I'm not sure which issue you confirmed. If we are talking about dynamic configs at startup, then according to the comments above, it seems to me that this fix, which tries to return the latest (dynamic) configs, works only if Kafka is in ZK mode. In KRaft, this fix is a no-op, as it still returns static configs.
Please correct me if I'm lost.
@chia7712, yes, you're right! For KRaft, this fix is a no-op.
LOGGER.debug("Received leadership changes for leaders: {} and followers: {}", partitionsBecomeLeader, partitionsBecomeFollower);
if (this.rlmConfig.isRemoteStorageSystemEnabled() && !isRemoteLogManagerConfigured()) {
if (config.remoteLogManagerConfig().isRemoteStorageSystemEnabled() && !isRemoteLogManagerConfigured()) {
This is unrelated to this PR, but config.remoteLogManagerConfig always returns the same config even when the config is updated dynamically. That seems error-prone, since it means the "updated" config can return a "remoteLogManagerConfig" holding the "previous" configs.
Yes, this is the main issue. We access the remote configurations using KafkaConfig.remoteLogManagerConfig.xyz(), but that reflects the value in the static server.properties file, not the dynamically updated one.
So, I changed the usages to KafkaConfig.xyz() to get the dynamically updated value.
As we have a great RemoteLogManagerConfig for remote storage, could we avoid moving configs out of RemoteLogManagerConfig? Especially since we are trying to reduce the size of KafkaConfig.
If this PR aims to fix it for ZK mode, maybe we can make sure the remoteLogManagerConfig returned by config always has the latest configs. For example:
- config.remoteLogManagerConfig always creates a new remoteLogManagerConfig based on currentConfig
- KafkaConfig#updateCurrentConfig should also update the inner _remoteLogManagerConfig
@kamalcph WDYT?
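The two alternatives suggested in this comment can be sketched side by side. This is a simplified illustration under stated assumptions, not the Kafka implementation: `RemoteView`, `FreshViewConfig`, and `RefreshedViewConfig` are hypothetical names, the config is reduced to a `Map<String, Long>`, and only the method names `remoteLogManagerConfig` and `updateCurrentConfig` are borrowed from the discussion.

```java
import java.util.Map;

// Hypothetical immutable view over the remote-storage settings.
final class RemoteView {
    static final String CACHE_SIZE = "remote.log.index.file.cache.total.size.bytes";
    private final long indexCacheSizeBytes;

    RemoteView(Map<String, Long> props) {
        this.indexCacheSizeBytes = props.get(CACHE_SIZE);
    }

    long indexCacheSizeBytes() {
        return indexCacheSizeBytes;
    }
}

// Option 1: derive a fresh view from the current config on every call.
final class FreshViewConfig {
    private volatile Map<String, Long> currentConfig;

    FreshViewConfig(Map<String, Long> staticProps) {
        this.currentConfig = Map.copyOf(staticProps);
    }

    void updateCurrentConfig(Map<String, Long> newProps) {
        this.currentConfig = Map.copyOf(newProps);
    }

    RemoteView remoteLogManagerConfig() {
        return new RemoteView(currentConfig);
    }
}

// Option 2: keep a cached view, but rebuild it whenever the current config is replaced.
final class RefreshedViewConfig {
    private Map<String, Long> currentConfig;
    private RemoteView remoteView;

    RefreshedViewConfig(Map<String, Long> staticProps) {
        updateCurrentConfig(staticProps);
    }

    synchronized void updateCurrentConfig(Map<String, Long> newProps) {
        this.currentConfig = Map.copyOf(newProps);
        this.remoteView = new RemoteView(currentConfig);
    }

    synchronized RemoteView remoteLogManagerConfig() {
        return remoteView;
    }
}

final class ViewOptionsDemo {
    public static void main(String[] args) {
        Map<String, Long> staticProps = Map.of(RemoteView.CACHE_SIZE, 1024L);
        Map<String, Long> dynamicProps = Map.of(RemoteView.CACHE_SIZE, 4096L);

        FreshViewConfig fresh = new FreshViewConfig(staticProps);
        fresh.updateCurrentConfig(dynamicProps);
        System.out.println(fresh.remoteLogManagerConfig().indexCacheSizeBytes());     // prints 4096

        RefreshedViewConfig refreshed = new RefreshedViewConfig(staticProps);
        refreshed.updateCurrentConfig(dynamicProps);
        System.out.println(refreshed.remoteLogManagerConfig().indexCacheSizeBytes()); // prints 4096
    }
}
```

In both variants of this sketch, a caller that stores a previously returned `RemoteView` in a field keeps seeing the old values, so callers would need to re-read the view rather than cache it.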
Can we take up this refactoring as part of the KAFKA-16976 ticket?
@chia7712 Good point. A similar issue is raised in the comment. We can fix the bug with the current approach in 3.8 and finish cleaning up the raised issues in a followup as part of KAFKA-16976
satishd left a comment
Thanks @kamalcph for finding this issue and the offline discussion on the changes.
The root cause here is that the RemoteLogManagerConfig instance is already initialized with the earlier values, and it is not updated when KafkaConfig.updateCurrentConfig is invoked during server initialization.
We should avoid passing KafkaConfig to RemoteLogManagerConfig and we need to make sure whenever KafkaConfig gets updated RemoteLogManagerConfig gets updated. With the current changes, it is happening in two different ways that we should avoid.
We should address it with the below behavior:
- Update RemoteLogManagerConfig with the respective dynamic configs when a broker is initialized. In this case, the respective configs in RemoteLogManagerConfig should get updated and there won't be any dangling old config references.
- Update the respective runtime components (RLM) based on the modified dynamic configs. This can continue to happen as part of DynamicRemoteLogConfig.
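The two behaviors listed above can be sketched as follows. This is a minimal illustration, not Kafka's API: `ToyReconfigurable`, `ToyCopyThrottle`, and `ToyDynamicConfig` are hypothetical names, and the listener interface is only loosely modelled on the idea of a `reconfigure(oldConfig, newConfig)` callback. The sketch assumes the config passed at startup already contains the effective (dynamic) values, and that later changes are pushed to registered runtime components.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical listener interface for components that react to config changes.
interface ToyReconfigurable {
    void reconfigure(Map<String, Long> oldConfig, Map<String, Long> newConfig);
}

// Hypothetical runtime component holding a copy-rate limit.
final class ToyCopyThrottle implements ToyReconfigurable {
    static final String COPY_RATE = "remote.log.manager.copy.max.bytes.per.second";
    private volatile long maxBytesPerSecond;

    // Step 1: the component starts from the effective config, not the static file.
    ToyCopyThrottle(Map<String, Long> effectiveStartupConfig) {
        this.maxBytesPerSecond = effectiveStartupConfig.get(COPY_RATE);
    }

    // Step 2: later dynamic changes are applied to the running component.
    @Override
    public void reconfigure(Map<String, Long> oldConfig, Map<String, Long> newConfig) {
        long oldRate = oldConfig.get(COPY_RATE);
        long newRate = newConfig.get(COPY_RATE);
        if (oldRate != newRate) {
            maxBytesPerSecond = newRate;
        }
    }

    long maxBytesPerSecond() {
        return maxBytesPerSecond;
    }
}

// Hypothetical config holder that notifies registered components on every update.
final class ToyDynamicConfig {
    private Map<String, Long> current;
    private final List<ToyReconfigurable> listeners = new ArrayList<>();

    ToyDynamicConfig(Map<String, Long> effectiveStartupConfig) {
        this.current = Map.copyOf(effectiveStartupConfig);
    }

    synchronized Map<String, Long> current() {
        return current;
    }

    synchronized void register(ToyReconfigurable listener) {
        listeners.add(listener);
    }

    synchronized void update(Map<String, Long> newConfig) {
        Map<String, Long> old = current;
        current = Map.copyOf(newConfig);
        for (ToyReconfigurable listener : listeners) {
            listener.reconfigure(old, current);
        }
    }
}

final class ReconfigureDemo {
    public static void main(String[] args) {
        ToyDynamicConfig config = new ToyDynamicConfig(Map.of(ToyCopyThrottle.COPY_RATE, 5000L));
        ToyCopyThrottle throttle = new ToyCopyThrottle(config.current());
        config.register(throttle);
        System.out.println(throttle.maxBytesPerSecond()); // prints 5000

        config.update(Map.of(ToyCopyThrottle.COPY_RATE, 8000L));
        System.out.println(throttle.maxBytesPerSecond()); // prints 8000
    }
}
```

Keeping a single holder as the source of truth avoids the situation described in this review, where the same value is reachable through two paths that can disagree.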
There was a discussion about "can we see the (latest) dynamic configs when a broker is initialized" (#16353 (comment)). In kraft, we create Those remoteLogManager configs [3] can be updated later when publishing metadata (latest configs), so that is possibly fine (?). Fixing the startup order seems to be a big issue, and maybe it is hard to guarantee that "all" components can see the latest configs in kraft.
[0] https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/BrokerServer.scala#L208
Yes, the KRaft config updates look good to me. The DynamicConfigPublisher will eventually call BrokerReconfigurable#reconfigure, and then the configs get updated on the respective components.
This requires a good amount of refactoring in the RemoteLogManagerConfig class. Shall we continue with this patch to backport it to the 3.8 branch? Or do we have to refactor the code and land the changes only in the trunk/3.9 branch?
showuon left a comment
LGTM! Could you mention that this issue only happened in ZK mode in the PR description? Thanks.
Done.
We can make the suggested improvements as a followup in trunk. Filed a followup KAFKA-16976.
Retriggered the CI job as one of the test runs timed out.
A few unrelated test failures, merging it to trunk for now.
@jlprat This is a bug fix that should be pushed to 3.8. wdyt?
Yes, it can be ported to 3.8. Thanks @satishd
The below remote log configs can be configured dynamically:
1. remote.log.manager.copy.max.bytes.per.second
2. remote.log.manager.fetch.max.bytes.per.second
3. remote.log.index.file.cache.total.size.bytes
If those values are configured dynamically, then during the broker restart, it ensures the dynamic values are loaded instead of the static values from the config.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>
Thanks @jlprat, merged to 3.8.
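The below remote log configs can be configured dynamically:
1. remote.log.manager.copy.max.bytes.per.second
2. remote.log.manager.fetch.max.bytes.per.second
3. remote.log.index.file.cache.total.size.bytes
If those values are configured dynamically, then after a broker restart the broker loads the static values from the config file instead of the dynamic values. Note that the issue happens only when running the server with ZooKeeper.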