Closed
copybara-service bot pushed a commit that referenced this pull request Mar 27, 2024
This CL contains two optimizations that were measured together.

1) InsertMiss (i.e. successful insert) optimization:
The idea is that when there is no kDeleted, we already know 99% of the information. So we find the position to insert with 2 asm instructions (or 3 in case of ARM or portable) and pass that as a hint into `prepare_insert`.
`prepare_insert` is out of line in order to minimize the effect on InsertHit (the most important case).
`prepare_insert` may use the hint when we still have growth left and the absence of kDeleted is guaranteed.
In case of kDeleted, we still call `find_first_non_full` in order to potentially find a kDeleted slot earlier. We may consider different ways to do it faster for kDeleted later.

2) `find_first_non_full` optimization:
After optimization #1, `find_first_non_full` is used:
1. during resize and copy
2. right after resize
3. during DropDeletedWithoutResize
4. in InsertMiss for tables with kDeleted

In cases 1-3 the table is quite sparse:
1. After resize it is 7/16 sparse.
2. During resize it is 7/16 maximum, but during the first inserts it is much sparser.
3. During copy it may be up to 7/8, but at the beginning it is way sparser.
4. During DropDeletedWithoutResize it is 25/32 sparse, but at the beginning it is way sparser.

The only case where `find_first_non_full` is used and the table is not known to be sparse is a table with deleted elements. But according to hashz, fewer than 3% of tables are like that, so adding an extra branch wouldn't hurt much there.

PiperOrigin-RevId: 619681885
Change-Id: Id3e2852cc6d85f6c8f90982d7aeb14799696bf39
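The hint logic described in this commit message can be pictured with a small sketch. Everything here is an assumption made for illustration: the control-byte values, the flat `std::vector` of control bytes, the linear scan, and the names `FindFirstNonFull` and `PrepareInsert` are hypothetical stand-ins, not the library's actual code. Resizing and the "growth left" check are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Control-byte values assumed for this sketch; full slots hold values >= 0.
constexpr std::int8_t kEmpty = -128;
constexpr std::int8_t kDeleted = -2;

// Linear scan for the first empty or deleted slot, starting at `start` and
// wrapping around. A simplified stand-in for `find_first_non_full`, which in
// a real table would probe group by group.
std::size_t FindFirstNonFull(const std::vector<std::int8_t>& ctrl,
                             std::size_t start) {
  for (std::size_t i = 0; i < ctrl.size(); ++i) {
    const std::size_t pos = (start + i) % ctrl.size();
    if (ctrl[pos] == kEmpty || ctrl[pos] == kDeleted) return pos;
  }
  assert(false && "table must contain a non-full slot");
  return ctrl.size();
}

// Hypothetical slow path. `empty_hint` is the empty slot at which the failed
// lookup stopped. It is trusted only when the table has no kDeleted slots;
// otherwise we rescan, because a kDeleted slot may sit earlier in the probe
// sequence than that empty slot.
std::size_t PrepareInsert(const std::vector<std::int8_t>& ctrl,
                          std::size_t probe_start, std::size_t empty_hint,
                          bool has_deleted) {
  if (!has_deleted) return empty_hint;
  return FindFirstNonFull(ctrl, probe_start);
}
```

In a table without kDeleted the hint and the rescan agree, so the rescan is pure overhead; with kDeleted present, the rescan can return an earlier, reusable slot.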
copybara-service bot pushed a commit that referenced this pull request Sep 19, 2024
…ark::DoNotOptimize` on both inputs and outputs and by removing the unnecessary and wrong `ABSL_RAW_CHECK` condition (`check != 0`) of `BM_ByteStringFromAscii_Fail` benchmark.

Relevant comment:

```
// The DoNotOptimize(...) function can be used to prevent a value or
// expression from being optimized away by the compiler. This function is
// intended to add little to no overhead.
// See: http://stackoverflow.com/questions/28287064
//
// The specific guarantees of DoNotOptimize(x) are:
//  1) x, and any data it transitively points to, will exist (in a register or
//     in memory) at the current point in the program.
//  2) The optimizer will assume that DoNotOptimize(x) could mutate x or
//     anything it transitively points to (although it actually doesn't).
//
// To see this in action:
//
//   void BM_multiply(benchmark::State& state) {
//     int a = 2;
//     int b = 4;
//     for (auto s : state) {
//       testing::DoNotOptimize(a);
//       testing::DoNotOptimize(b);
//       int c = a * b;
//       testing::DoNotOptimize(c);
//     }
//   }
//   BENCHMARK(BM_multiply);
//
// Guarantee (2) applied to 'a' and 'b' prevents the compiler lifting the
// multiplication outside of the loop. Guarantee (1) applied to 'c' prevents the
// compiler from optimizing away 'c' as dead code.
```

To see #1 and #2 in action, see: https://godbolt.org/z/ned1578ve

PiperOrigin-RevId: 676588185
Change-Id: I7ed3e4bed8274b54ac7877316f2d82c33d68f00f
GerHobbelt pushed a commit to GerHobbelt/abseil-cpp that referenced this pull request Nov 1, 2024
copybara-service bot pushed a commit that referenced this pull request Dec 20, 2024
I find that this makes it much easier to understand the problems.

Example error without this change:

Hash expansion of #0(24-byte object <00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00>) is a suffix of the hash expansion of #1(24-byte object <00-CB E3-BF FB-12 00-00 F0-FF BE-9D FD-7F 00-00 B8-0D 04-1C 59-7F 00-00>).

Example error with this change:

Hash expansion of #0(24-byte object <00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00>);[ 0x0000: 0000 0000 0000 0000 ] is a suffix of the hash expansion of #1(24-byte object <00-EC E3-FF F8-31 00-00 50-10 90-94 FF-7F 00-00 40-98 E8-33 5E-7F 00-00>);[ 0x0000: 2ebc 2196 d3ab 37da 0x0000: 0000 0000 0x0000: 0000 0000 0000 0000 ].

PiperOrigin-RevId: 708356078
Change-Id: Iab9060d70eeb051c5a28786e4542f49629165ce0
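To make the "is a suffix of" wording in these errors concrete, here is a minimal sketch of a suffix check over two byte sequences, plus a hex dump in the spirit of the added output. The helper names `IsSuffixOf` and `HexDump` and the exact dump layout are hypothetical, chosen for illustration; they are not taken from the test framework.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns true if the bytes of `a` appear at the very end of `b`.
// An empty `a` counts as a suffix of anything.
bool IsSuffixOf(const std::vector<std::uint8_t>& a,
                const std::vector<std::uint8_t>& b) {
  return a.size() <= b.size() &&
         std::equal(a.rbegin(), a.rend(), b.rbegin());
}

// Renders bytes as lowercase hex, with a space after every two bytes,
// e.g. {0x2e, 0xbc, 0x21, 0x96} becomes "2ebc 2196".
std::string HexDump(const std::vector<std::uint8_t>& bytes) {
  static const char kDigits[] = "0123456789abcdef";
  std::string out;
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    if (i != 0 && i % 2 == 0) out += ' ';
    out += kDigits[bytes[i] >> 4];
    out += kDigits[bytes[i] & 0x0F];
  }
  return out;
}
```

Printing the expansion next to the object bytes, as the second example error does, shows directly which trailing bytes the two values share.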
copybara-service bot pushed a commit that referenced this pull request Sep 8, 2025
In case of two nested back-to-back signals (such as what happens in NestedSignal test) we could end up erroneously using the frame pointer from ucontext_t twice, leading to premature backtrace termination.

In the situation where this happens, the call stack looks like

  #0 <unwinder frames>
  #1 SigUsr2Handler
  #2 __kernel_rt_sigreturn
  #3 raise
  #4 SigUsr1Handler
  #5 __kernel_rt_sigreturn
  #6 raise
  #7 RaiseSignal
  ...

When unwinding from #2, we get the fp value from the ucontext (as we should). However, because raise does not modify the fp and because SigUsr1Handler is also a signal handler, when we try to unwind from #4 (#3 is skipped), NextStackFrame ends up looking at the ucontext fp again, and comparing it with the previous (identical) FP value. Non-strict equality accepts this as a valid frame, but the unwinder later bails out due to a zero-sized frame.

Using a strict equality causes NextStackFrame to reject the ucontext fp and use the FP from the FP chain instead. This causes us to skip a few more frames, but at least we continue to unwind instead of giving up. In this case, the computed backtrace skips functions #3, #4 and #6.

PiperOrigin-RevId: 804308754
Change-Id: I5d43e6bea80e4abff1075ada03782ae11c599161
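The effect of the stricter comparison can be sketched in isolation. This is a simplified model, not the unwinder's code: `PickNextFp` and its parameters are hypothetical, frame pointers are modeled as plain integers, and the stack is assumed to grow downward so that caller frames have higher addresses.

```cpp
#include <cstdint>

// Chooses the frame pointer to continue unwinding from after popping a
// signal handler frame.
//   old_fp:      frame pointer of the frame just unwound
//   chain_fp:    next frame pointer read from the ordinary FP chain
//   ucontext_fp: frame pointer saved in the signal's ucontext
//   strict:      true models the fixed comparison, false the old one
std::uintptr_t PickNextFp(std::uintptr_t old_fp, std::uintptr_t chain_fp,
                          std::uintptr_t ucontext_fp, bool strict) {
  // The ucontext value is only useful if it moves us toward the caller.
  // With the non-strict test, a value equal to old_fp is accepted and
  // yields a zero-sized frame.
  const bool accept =
      strict ? (ucontext_fp > old_fp) : (ucontext_fp >= old_fp);
  return accept ? ucontext_fp : chain_fp;
}
```

With `ucontext_fp == old_fp`, the non-strict variant returns the same pointer again, which is the zero-sized frame the message describes, while the strict variant falls back to `chain_fp` and unwinding continues.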