[WIP] Benchmarks POC #381
Closed
Erotemic wants to merge 18 commits into
line_profiler/_line_profiler.pyx
- Added macro `hash_bytecode()` for hashing bytestrings (doesn't
seem to have performance benefits so it's currently unused;
TODO: more tests)
- Replaced repeated retrieval of `prof._c_last_time[ident]` with a
stored reference thereto
Further optimizations
I *really* hope this is not why the Windows CI is failing lol
This reverts commit c5f95e2.
Collaborator
That looks nice! One suggestion I have would be to make use of pytest-benchmark and its pytest-benchmark[histogram] addon (though it doesn't make those nice plots by default, that can be implemented on top of its data). This is what I was planning to do for regression tests, and I have an example script making use of it at jsonpickle's repository and run instructions here. Sample output is included in the attached pictures. Also, I don't think this took into account my most recent commit for fixing your GPT review suggestions 2 and 4 over on #376.
Member
Author
Going to close this as I'm not going to get to it any time soon.


Adding onto #376 to start to measure the amount of overhead line-profiler adds as a function of python version and line-profiler version (and other factors).
The idea is to have a script that can generate a report for a specific style of benchmarks, then multiple of these result files can be aggregated to view statistics over different contexts in which the benchmarks are run. This way we can slice/dice the numbers in a way to gain more insight.
This is VERY rough right now (hard-coded paths and whatnot), but I have the initial proof of concept, which shows the impact of the changes in #376.
These also rely on some of my utility libraries: scriptconfig, kwutil, and ubelt.
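The report-then-aggregate idea described above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the script in this PR: the `workload` function, the `label` tag, and the JSON layout are all hypothetical, and the real benchmarks use scriptconfig / kwutil / ubelt rather than these helpers.

```python
import json
import platform
import statistics
import tempfile
import timeit
from collections import defaultdict
from pathlib import Path


def workload(n=2000):
    """Toy CPU-bound function standing in for a profiled benchmark."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_benchmark(label, out_dir, repeat=5, number=20):
    """Time the workload and write one result file tagged with its context."""
    times = timeit.repeat(workload, repeat=repeat, number=number)
    record = {
        "label": label,  # hypothetical tag, e.g. a line-profiler version
        "python": platform.python_version(),
        "best_seconds": min(times) / number,
    }
    path = Path(out_dir) / f"result_{label}.json"
    path.write_text(json.dumps(record))
    return path


def aggregate(result_paths, key="label"):
    """Group result files by a context key and summarize each group."""
    groups = defaultdict(list)
    for path in result_paths:
        record = json.loads(Path(path).read_text())
        groups[record[key]].append(record["best_seconds"])
    return {
        k: {"n": len(v), "mean": statistics.mean(v), "min": min(v)}
        for k, v in groups.items()
    }


def demo():
    """Produce two result files in a temp directory and aggregate them."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = [run_benchmark(label, tmp) for label in ("baseline", "patched")]
        return aggregate(paths, key="label")


summary = demo()
for label, stats in summary.items():
    print(label, stats)
```

Passing `key="python"` to `aggregate` would group the same files by interpreter version instead, which is the kind of slicing the plots below rely on.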
Plot aggregating all versions of python the tests are run against:
Plot splitting out by python versions:
It looks like sys.monitoring adds quite a bit more overhead than the legacy way of handling the trace callbacks, but in all cases #376 does seem to be a speed improvement.
The jump up from 4.0 to 4.1 is stark; I'm wondering what's happening there, or whether there is a mistake in my benchmark script. I do think the code I wrote here can serve as a decent launch point for an automated way of measuring how we are doing on overhead and whether new patches are causing significant regressions.
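One way to sanity-check a comparison like this outside of line-profiler is to time a no-op per-line callback under each mechanism. The sketch below is an assumption-laden illustration, not how line-profiler hooks in: it measures only the floor cost of event dispatch with an empty callback, the `workload` function is hypothetical, and the `sys.monitoring` branch is skipped on Python versions before 3.12 or when the profiler tool id is already taken.

```python
import sys
import timeit


def workload(n=2000):
    """Toy loop whose lines trigger the per-line events being measured."""
    total = 0
    for i in range(n):
        total += i
    return total


def best_time(number=50):
    """Best-of-three per-call time for the workload."""
    return min(timeit.repeat(workload, repeat=3, number=number)) / number


def time_with_settrace():
    """Time the workload with a no-op legacy trace function installed."""
    def tracer(frame, event, arg):
        return tracer  # returning the tracer keeps line events enabled

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return best_time()
    finally:
        sys.settrace(previous)


def time_with_monitoring():
    """Time the workload with a no-op sys.monitoring LINE callback."""
    mon = getattr(sys, "monitoring", None)
    if mon is None:
        return None  # Python < 3.12 has no sys.monitoring
    tool_id = mon.PROFILER_ID
    try:
        mon.use_tool_id(tool_id, "overhead-sketch")
    except ValueError:
        return None  # another tool already holds this id

    def on_line(code, line_number):
        return None

    try:
        mon.register_callback(tool_id, mon.events.LINE, on_line)
        mon.set_events(tool_id, mon.events.LINE)
        return best_time()
    finally:
        mon.set_events(tool_id, mon.events.NO_EVENTS)
        mon.register_callback(tool_id, mon.events.LINE, None)
        mon.free_tool_id(tool_id)


results = {
    "plain": best_time(),
    "settrace": time_with_settrace(),
    "monitoring": time_with_monitoring(),
}
for name, seconds in results.items():
    print(name, seconds)
```

Because the callbacks do nothing, any gap seen here reflects interpreter dispatch cost only; the numbers in the plots also include line-profiler's own bookkeeping.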