Fix compiler optimize thread local variable access #2918
Merged
Contributor
Why not use the BAIDU_GET_VOLATILE_THREAD_LOCAL macro instead?
Contributor
Author
My concern is that BAIDU_GET_VOLATILE_THREAD_LOCAL adds an extra function call in non-LTO mode. That is only my own consideration, though; if you would rather go with BAIDU_GET_VOLATILE_THREAD_LOCAL, I will use it. @wwbmmm
Contributor
I think the root cause is the compiler's optimization of TLS variable access, not inlining, so using BAIDU_GET_VOLATILE_THREAD_LOCAL is the more fundamental fix.
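A minimal sketch of the idea behind a non-inlined TLS accessor, assuming GCC or Clang. The names `TaskGroup`, `tls_group`, `get_tls_group` and `id_seen_on_new_thread` are hypothetical stand-ins, and the real BAIDU_GET_VOLATILE_THREAD_LOCAL definition in brpc may differ. The point is only that every read goes through a call the optimizer cannot fold into the caller, so the TLS address is resolved again on each access.

```cpp
#include <thread>

// Hypothetical stand-in for a per-worker object such as the one behind tls_task_group.
struct TaskGroup { int id; };

// Each OS thread gets its own copy of this pointer.
static thread_local TaskGroup* tls_group = nullptr;

// Non-inlined accessor (assumption: GCC/Clang attribute syntax).
// Because the body is not merged into the caller, the caller cannot
// hoist the TLS address computation out of a loop around the call.
#if defined(__GNUC__) || defined(__clang__)
__attribute__((noinline))
#endif
TaskGroup* get_tls_group() {
#if defined(__GNUC__) || defined(__clang__)
    asm volatile("");  // empty asm statement; keeps the call from being treated as side-effect free
#endif
    return tls_group;
}

// Sets the TLS pointer on a fresh thread and reads it back through the accessor.
int id_seen_on_new_thread(int id) {
    int seen = -1;
    std::thread worker([&] {
        TaskGroup group{id};
        tls_group = &group;           // visible only on this thread
        seen = get_tls_group()->id;   // read through the accessor
        tls_group = nullptr;
    });
    worker.join();
    return seen;
}
```

Calling `id_seen_on_new_thread(7)` reads the value back on the new thread, while `get_tls_group()` on the calling thread still sees its own untouched copy. The cost mentioned in the discussion is visible here as well: each read is a real function call.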
`bthread_usleep`/`bthread_yield` contain accesses to TLS variables. In LTO mode, errors may occur due to cross-module optimization. For example, when `bthread_usleep` is inlined into `WatchConnections`, the compiler (clang-17.0.6) caches the TLS address outside the loop, triggering the error mentioned in apache#2156.
5fcdca5 to e951ffb
Contributor
Author
I have pushed a new version.
Contributor
LGTM
Contributor
Author
I have a few more questions.
Contributor
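Not sure. Compiler behavior keeps evolving, so I cannot tell whether the problem will show up again in other scenarios; we can only handle it case by case.
No need. Let's wait for other developers to check for problems; if there are none, it will be merged.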
What problem does this PR solve?
Issue Number:
Problem Summary:
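`bthread_usleep` and `bthread_yield` access `tls_task_group`. In LTO mode this triggers the problem described in #2156: the example prints the error `bthread=4294967808 sched_to itself!`, and in a real service the process crashes with a core dump.
How to reproduce
Run `./parallel_echo_client --use_bthread=true`, then press Ctrl+C to exit. The server does not need to be running.
Assembly comparison
clang-11 + ThinLTO: `bthread_usleep` is inlined, but TLS variable access is not optimized, so the problem does not occur.
clang-17 + ThinLTO: the address of the TLS variable is cached outside the loop, so the problem occurs.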
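The difference between the two compilers can be illustrated with a toy model. This is not brpc code: `Worker`, `tls_address`, `yield_and_migrate` and the two counting functions are hypothetical, and the model simply treats "the TLS address" as a slot that depends on which worker is currently running the task. It shows why an address computed once before the loop goes stale when a yield resumes the task on another worker.

```cpp
#include <array>

// Toy model: two workers, each with its own "thread-local" slot.
struct Worker { int group_id; };
static std::array<Worker, 2> workers = {{ {100}, {200} }};
static int current_worker = 0;

// Models resolving a TLS address: the result depends on the running worker.
Worker* tls_address() { return &workers[current_worker]; }

// Models a sleep/yield after which the task resumes on the other worker.
void yield_and_migrate() { current_worker = 1 - current_worker; }

// Address resolved once, outside the loop (the problematic shape).
int mismatches_with_hoisted_address(int iterations) {
    current_worker = 0;
    int mismatches = 0;
    Worker* cached = tls_address();
    for (int i = 0; i < iterations; ++i) {
        yield_and_migrate();
        if (cached != tls_address()) ++mismatches;  // cached slot belongs to another worker
    }
    return mismatches;
}

// Address resolved again after every yield (the safe shape).
int mismatches_with_fresh_address(int iterations) {
    current_worker = 0;
    int mismatches = 0;
    for (int i = 0; i < iterations; ++i) {
        yield_and_migrate();
        Worker* fresh = tls_address();
        if (fresh != tls_address()) ++mismatches;
    }
    return mismatches;
}
```

In this model the hoisted variant points at the wrong worker's slot after every odd-numbered yield, while the variant that re-resolves the address never does.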
What is changed and the side effects?
Changed:
Disable inlining of `bthread_usleep` and `bthread_yield`.
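As a rough illustration of this approach, assuming GCC or Clang, a function that touches TLS can be marked so that it is never inlined into its callers. `SKETCH_NOINLINE`, `sleep_like`, `poll_loop` and `tls_sleep_calls` are hypothetical names used only for this sketch; they are not the brpc declarations.

```cpp
// Assumption: GCC/Clang attribute syntax; other compilers fall back to no attribute.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

// Per-thread counter standing in for the TLS state the real functions touch.
static thread_local int tls_sleep_calls = 0;

// Hypothetical stand-in for a sleep/yield entry point. Keeping it out of line
// means its TLS access stays inside this function body, even under LTO.
SKETCH_NOINLINE int sleep_like(int microseconds) {
    ++tls_sleep_calls;
    return microseconds >= 0 ? 0 : -1;
}

// A caller with a loop, similar in shape to a connection-watching loop.
int poll_loop(int rounds) {
    int failures = 0;
    for (int i = 0; i < rounds; ++i) {
        if (sleep_like(1000) != 0) ++failures;
    }
    return failures;
}

int sleep_calls_on_this_thread() { return tls_sleep_calls; }
```

The trade-off is the one listed under side effects: callers pay for a real call on every iteration when LTO would otherwise have inlined it.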
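Alternative
Access `tls_task_group` through the BAIDU_GET_VOLATILE_THREAD_LOCAL macro. However, as long as `bthread_usleep` is kept from being inlined, the subsequent optimization does not happen. Suggestions are welcome.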
Side effects:
In non-LTO mode: no impact.
In LTO mode: `bthread_usleep` and `bthread_yield` are no longer inlined.
Check List: