Faster properties #60
Merged
This benchmark should help diagnose #58. On my machine with GCC 9.3, compilation for LargePod with 100 properties used 3.4 GiB RAM and took 48s. Compilation for LargePod with 200 properties (PROPERTIES_2X) used all of the system memory available to the VM (8 GiB) and terminated after 5m30s. This points to horrendous scalability issues with get_reader/has_writer, likely caused by a theoretically quadratic blowup in the compilation of ProcessClass(). (It would have been quadratic if this was executed at runtime; I am not sure how to properly calculate and describe it in this case.)
Added an optimisation to the get|has_reader|writer methods which checks the neighbouring members first, before deferring to a linear search (very slow, see #58). Also adds the REFL_DISALLOW_SEARCH_FOR_RW define. If defined, linear search in the above situations is not permitted and compilation will fail.
The implementation of the linear search used by the property utilities was refactored to reduce compilation times and reduce the number of template instantiations that the compiler has to make.
7547d16 to
fb8f817
Compare
With the latest changes, compilation times for bench-large-pod.cpp (100 properties) have improved as follows:
Time to compile was reduced by at least 88% (in the case of gcc) and peak memory usage was reduced by at least 84% (in the case of gcc). bench-large-pod-search.cpp saw even more improvement. This version of the above benchmark exercises the slower code path in get_reader/writer. Previously it could not compile in my VM due to OOM (8 GiB RAM available). It now compiles successfully, only a little slower than the regular version (which has getters and setters defined next to each other, a heuristic for which was added in 3547e6e).
Closes #58.
This PR aims to fix the slow compilation uncovered in #58.
An optimisation is added to property-related utilities. It is based on the presumption that getters and setters are reflected (via the REFL_AUTO or REFL_FUNC macros) one after the other. The optimisation consists of checking the neighbouring members first, before resorting to a linear search.
Applies to:
But does NOT apply to:
I have added a bench/ tree, which will be used for benchmarks. Only one benchmark exists at the moment, bench-large-pod.cpp, which iterates over the members of a large POD with getters and setters and matches property getters to setters via get_reader/writer.
Results of compilation of bench-large-pod.cpp:
Without the optimisation:
With the optimisation: