DHAT: Optimize import time for large profiles by flowln · Pull Request #4640 · firefox-devtools/profiler

flowln · 2023-05-28T22:54:22Z

This commit changes the way in which the stackTable tree is constructed in DHAT's import algorithm.

Previously, the algorithm would look a bit like this:

    For each PP:
        For each frame in the PP, from top to bottom:
            Loop over all existing stackTable entries, until we find the
            one we want, or go through all of them.

            If we found an entry, we are good.
            If not, create a new entry.
        [...]
    [...]

This approach, while correct, isn't optimal, since we would potentially loop over a wide range of stackTable entries that couldn't possibly be the one we were looking for.

With this commit, the algorithm becomes more or less like this:

    For each PP:
        For each frame in the PP, from top to bottom:
            Loop over the stackTable entries whose prefix is the same as
            the one we're looking for.

            If we found an entry, we are good.
            If not, create a new entry, also updating the list of
            entries with our prefix.
        [...]
    [...]

With this, we avoid looking over a very big number of entries whose prefix are different from ours. The effect is that of switching from a linear search on a list to following pointers on a tree structure.

The list of entries with the same prefix is implemented as a JS Map of prefixes to lists of indexes into the stackTable.

With this change, big DHAT profiles take a lot less time to load (e.g. a 20MB profile would take ~2min before, now it takes ~2.5s :D).

This is the profile I tested with: dhat.out.3106 🙂

Sorry if I messed something up, I'm not particularly acquainted with JS (I even had a fight with the type checker, despite we having the same name! :P )

This commit changes the way in which the stackTable tree is constructed in DHAT's import algorithm. Previously, the algorithm would look a bit like this: For each PP: For each frame in the PP, from top to bottom: Loop over all existing stackTable entries, until we find the one we want, or go through all of them. If we found an entry, we are good. If not, create a new entry. [...] [...] This approach, while correct, isn't optimal, since we would potentially loop over a wide range of stackTable entries that couldn't possibly be the one we were looking for. With this commit, the algorithm becomes more or less like this: For each PP: For each frame in the PP, from top to bottom: Loop over the stackTable entries whose prefix is the same as the one we're looking for. If we found an entry, we are good. If not, create a new entry, also updating the list of entries with our prefix. [...] [...] With this, we avoid looking over a very big number of entries whose prefix are different from ours. The effect is that of switching from a linear search on a list to following pointers on a tree structure. The list of entries with the same prefix is implemented as a JS Map of prefixes to lists of indexes into the stackTable. With this change, big DHAT profiles take a lot less time to load (e.g. a 20MB profile would take ~2min before, now it takes ~2.5s :D). Signed-off-by: flow <flowlnlnln@gmail.com>

codecov · 2023-05-28T23:01:31Z

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (c3f2c3f) 88.61% compared to head (bed5b29) 88.62%.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #4640   +/-   ##
=======================================
  Coverage   88.61%   88.62%           
=======================================
  Files         294      294           
  Lines       26095    26101    +6     
  Branches     7035     7036    +1     
=======================================
+ Hits        23125    23131    +6     
  Misses       2765     2765           
  Partials      205      205

Impacted Files	Coverage Δ
src/profile-logic/import/dhat.js	`98.31% <100.00%> (+0.08%)`	⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

julienw

Thanks, this looks like a very good addition indeed, thanks for the dramatic speedup! I see we're mostly getting rid of the stack table loop by using this tree instead. It makes sense for this case where we're scanning the stack table repeatidly, even though this duplicates the data in these 2 structures.

I left a few comments to make the code a bit easier to follow. That shouldn't change how the algorithm works, rather just put together the things that work together.
Tell me what you think!

julienw · 2023-05-30T13:16:53Z

-          break;
+      if (candidateStackTables) {
+        // Start searching for a stack index.
+        stackIndex = stackTable.length;


nit: I think I'd convert this to let stackIndex = null right before the if. Then in the if below we can check for if (stackIndex === null) { stackIndex = stackTable.length; ... }. I believe this would make this code more explicit. What do you think?

While implementing your other suggestion (#4640 (comment)), I found out that it will only work if we use stackIndex = stackTable.length every time, not just when entering this loop.

This is due to the old value of stackIndex, from a previous iteration of the loop, not being equal to stackTable.length, while candidateStackTables is undefined (so, we are in this prefix for the first time). So, it will neither enter this loop, nor create a new stackTable entry, which causes issues.

Because of that, I think it's best to simply move this stackIndex = stackTable.length line up, and add a comment explaining that that is a fallback when we don't enter this loop, or we enter it but don't find a match.

Mmm I feel like my suggestion here would still work but what you did works too, and the comment makes it clear enough, so I don't mind.

julienw · 2023-05-30T13:20:12Z

-    let stackIndex = -1;
+    let stackIndex = 0;


I think this value isn't used outside of the loop now, so it can be moved below. Also look at the other comment about the same stackIndex variable below.

This variable is still used one time after the loop, to tell the ending stackIndex of the PP entry to the allocationsTable!

julienw · 2023-05-30T13:22:15Z

+        // Since we just created a stack index, the next frames necessarily don't have an existing stack index.
+        candidateStackTables = [];
+      } else {
+        candidateStackTables = postfix.get(stackIndex);


I think you can run candidateStackTables = postfix.get(prefix) at the start of the loop (right before if (candidateStackTables)), and get rid of this line as well as the previous line and the line let candidateStackTables = postfix.get(null); before the loop.

julienw · 2023-05-30T13:27:43Z

        stackTable.category.push(otherSubCategory);
        stackTable.prefix.push(prefix);
+
+        const candidateList = postfix.get(prefix);


I believe this will already be candidateStackTables, with the changes I suggest, right? So we'll be able to reuse it directly.
(with your current patch, it's not always the case, especially when using the empty array line 310)

mstange · 2023-05-30T15:08:23Z

It looks like a hashmap with string keys of the type ${prefix}-${frame} is even faster, and also simpler.

mstange · 2023-05-30T15:36:42Z

It looks like a hashmap with string keys of the type ${prefix}-${frame} is even faster, and also simpler.

Ah, this approach would change behavior because we would no longer be de-duplicating stacks for frames of the same func. So I think it's fine to go ahead with the approach in this PR. I can create a follow-up PR later to implement the hashmap approach. What you have now is a huge improvement and worth shipping.

It seems like there are other problems with the importer on the provided profile - the regular expression doesn't match due to missing column numbers (for the functions with file+line information) and due to missing filenames (for the functions from libraries without debug info).

This makes some misc. changes to make it a bit easier to understand the code added in the previous commit. This doesn't (i hope) change any behaviours! Signed-off-by: flow <flowlnlnln@gmail.com>

julienw

Thanks for the updates, this looks good to me now!

julienw requested changes May 30, 2023

View reviewed changes

refactor(DHAT): improve code readability of the stack tree builder

bed5b29

This makes some misc. changes to make it a bit easier to understand the code added in the previous commit. This doesn't (i hope) change any behaviours! Signed-off-by: flow <flowlnlnln@gmail.com>

flowln force-pushed the optimize_dhat_import branch from 6c42a8c to bed5b29 Compare May 30, 2023 21:16

julienw self-requested a review May 31, 2023 09:29

julienw approved these changes Jun 6, 2023

View reviewed changes

Merge branch 'main' into optimize_dhat_import

9726cce

julienw enabled auto-merge (squash) June 6, 2023 16:01

julienw merged commit bf0526d into firefox-devtools:main Jun 6, 2023

canova mentioned this pull request Jun 20, 2023

Deploy June 20, 2023 #4675

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DHAT: Optimize import time for large profiles#4640

DHAT: Optimize import time for large profiles#4640
julienw merged 3 commits into
firefox-devtools:mainfrom
flowln:optimize_dhat_import

flowln commented May 28, 2023

Uh oh!

codecov Bot commented May 28, 2023 •

edited

Loading

Uh oh!

julienw left a comment

Uh oh!

julienw May 30, 2023

Uh oh!

flowln May 30, 2023

Uh oh!

julienw Jun 6, 2023

Uh oh!

julienw May 30, 2023

Uh oh!

flowln May 30, 2023

Uh oh!

julienw May 30, 2023

Uh oh!

julienw May 30, 2023

Uh oh!

mstange commented May 30, 2023

Uh oh!

mstange commented May 30, 2023

Uh oh!

julienw left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

flowln commented May 28, 2023

Uh oh!

codecov Bot commented May 28, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

julienw left a comment

Choose a reason for hiding this comment

Uh oh!

julienw May 30, 2023

Choose a reason for hiding this comment

Uh oh!

flowln May 30, 2023

Choose a reason for hiding this comment

Uh oh!

julienw Jun 6, 2023

Choose a reason for hiding this comment

Uh oh!

julienw May 30, 2023

Choose a reason for hiding this comment

Uh oh!

flowln May 30, 2023

Choose a reason for hiding this comment

Uh oh!

julienw May 30, 2023

Choose a reason for hiding this comment

Uh oh!

julienw May 30, 2023

Choose a reason for hiding this comment

Uh oh!

mstange commented May 30, 2023

Uh oh!

mstange commented May 30, 2023

Uh oh!

julienw left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov Bot commented May 28, 2023 •

edited

Loading