Deploy Dec 5, 2023 by julienw · Pull Request #4832 · firefox-devtools/profiler

julienw · 2023-12-05T10:30:42Z

Important performance changes:
[Markus Stange] Replace child count with "has children" (Merge PR #4829)
[Markus Stange] Reduce the time of switching to the stack chart tab for the first time (Merge PR #4825)
[Markus Stange] Speed up getCallNodeInfo (#4826)
[Markus Stange] Improve flame graph performance. (Merge PR #4820)
[Markus Stange] Improve findHeavyPathToSameFunctionAfterInversion. (Merge PR #4813)
[Markus Stange] Optimize call node table traversals (Merge PR #4807)
[Markus Stange] Make traced durations honor the preview selection. (Merge PR #4817)
[Markus Stange] Use a call node path to make the stack copy string. (Merge PR #4811)
[Markus Stange] Simplify heightFunction. (Merge PR #4810)
[Markus Stange] Stop using sampleCallNodes in ThreadSampleGraph. (Merge PR #4809)
[Markus Stange] Only set fillStyle once in drawSamples. (Merge PR #4816)
[Markus Stange] Make the flame graph and stack chart always non-inverted (Merge PR #4804)

Compute the inverted selected call node in the action creator, not in the reducer. This means we don't have to pass the call tree as far down. Also move the computation into the CallTree class. And fix it to stop at nodes with heaviest self time, even if they're not leaf nodes.

Before: https://share.firefox.dev/47ydcmp
After: https://share.firefox.dev/40Y48og

Fixes #844.

The old code was saving time by sorting siblings only for the nodes that were displayed based on the current (preview) range selection. This made it cheaper to compute the flame graph for a small range, but it meant that we had to re-do the sort every time the selection changed.

Now we do the sorting once, based on the entire call node table. This is expensive but it is a one-time cost (and it's still cheaper than if you were looking at the entire call tree with the old implementation). Then we cache the "ordered rows" and don't have to sort again on each range selection change.
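The sort-once approach could look roughly like the sketch below. This is not the profiler's actual code: `CallNode`, `computeOrderedRows` and `getOrderedRows` are hypothetical names, and ordering siblings by function name is an assumption made for illustration. The rows are built once from the whole call node table and cached per table, so a range selection change can reuse them.

```typescript
// Hypothetical, simplified call node: parent index (-1 for roots), depth, name.
interface CallNode {
  prefix: number;
  depth: number;
  funcName: string;
}

// Build one array of call node indexes per depth. Within a row, nodes are
// ordered by their parent's position in the row above, then by function name.
function computeOrderedRows(table: CallNode[]): number[][] {
  const rows: number[][] = [];
  table.forEach((node, index) => {
    while (rows.length <= node.depth) {
      rows.push([]);
    }
    rows[node.depth].push(index);
  });

  // position[i] is the index of call node i inside its (already sorted) row.
  const position = new Array<number>(table.length).fill(0);
  const parentPosition = (i: number): number =>
    table[i].prefix === -1 ? -1 : position[table[i].prefix];

  for (const row of rows) {
    row.sort((a, b) => {
      const byParent = parentPosition(a) - parentPosition(b);
      if (byParent !== 0) {
        return byParent;
      }
      if (table[a].funcName < table[b].funcName) return -1;
      if (table[a].funcName > table[b].funcName) return 1;
      return 0;
    });
    row.forEach((nodeIndex, i) => {
      position[nodeIndex] = i;
    });
  }
  return rows;
}

// Cache keyed on the table object, so the sort runs once per call node table.
const rowCache = new WeakMap<CallNode[], number[][]>();

function getOrderedRows(table: CallNode[]): number[][] {
  let rows = rowCache.get(table);
  if (rows === undefined) {
    rows = computeOrderedRows(table);
    rowCache.set(table, rows);
  }
  return rows;
}
```

With this shape, only the per-node timings would need to be recomputed when the preview selection changes; the row layout stays the same.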

Rather than talking about "0-based" and "1-based" depths, I think it would be less confusing to always treat depths as zero-based and rename these variables. When I was reading the stack chart code, I was initially quite confused because we were creating an array with length maxDepth, and that seemed too short.
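To illustrate the naming point, here is a small sketch with zero-based depths, where the number of rows is "max depth plus one". The helper names are hypothetical and not taken from the profiler code.

```typescript
// Depths are zero-based: a root call node has depth 0.
function computeMaxDepthPlusOne(depths: number[]): number {
  let maxDepth = -1;
  for (const depth of depths) {
    if (depth > maxDepth) {
      maxDepth = depth;
    }
  }
  // With zero-based depths, the number of rows is maxDepth + 1.
  return maxDepth + 1;
}

// Create one row per depth and put each node index into the row for its depth.
function createRows(depths: number[]): number[][] {
  const rowCount = computeMaxDepthPlusOne(depths);
  const rows: number[][] = Array.from({ length: rowCount }, () => []);
  depths.forEach((depth, nodeIndex) => {
    rows[depth].push(nodeIndex);
  });
  return rows;
}
```

An array sized with a variable called `maxDepthPlusOne` no longer looks one element too short to a reader.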

…r-row arrays in computeFlameGraphRows. We don't use computeCallNodeMaxDepthPlusOne here because that one would only include the depth of call nodes that are used by any of the filtered samples. But computeFlameGraphRows looks at the entire call node table, and needs to reserve enough rows to have space even for call nodes which aren't used by any samples.

Bumps [@adobe/css-tools](https://github.com/adobe/css-tools) from 4.3.1 to 4.3.2.
- [Changelog](https://github.com/adobe/css-tools/blob/main/History.md)
- [Commits](https://github.com/adobe/css-tools/commits)

---
updated-dependencies:
- dependency-name: "@adobe/css-tools"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

We no longer need the child count, because the call tree's getChildren() method now uses nextSibling to find the children, so it doesn't need the child count any more for ending the search early. And the new flame graph implementation has dropped the other use of the child count. Having just one byte per call node rather than four bytes might improve the performance of computeCallTreeCountsAndSummary slightly, because it needs to touch fewer bytes in memory.

This avoids calls to getChildren for nodes which are visible but not expanded. The tree view calls hasChildren in order to determine whether to show the disclosure arrow. Avoiding the call to getChildren means that we don't have to allocate the array and sort it. This is especially useful in the inverted tree, which has lots of non-expanded nodes at the root level.
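A rough sketch of the "has children" idea, under the assumption that the call node table stores a parent index per node (`-1` for roots). `computeHasChildren` and `shouldShowDisclosureArrow` are hypothetical names, not the profiler's API.

```typescript
// Assumption: prefix[i] is the parent index of call node i, or -1 for a root.
// One byte per call node is enough to record whether it has any children.
function computeHasChildren(prefix: Int32Array): Uint8Array {
  const hasChildren = new Uint8Array(prefix.length);
  for (let i = 0; i < prefix.length; i++) {
    const parent = prefix[i];
    if (parent !== -1) {
      hasChildren[parent] = 1;
    }
  }
  return hasChildren;
}

// The tree view only needs a yes/no answer for the disclosure arrow, so no
// children array has to be allocated or sorted for non-expanded nodes.
function shouldShowDisclosureArrow(
  hasChildren: Uint8Array,
  nodeIndex: number
): boolean {
  return hasChildren[nodeIndex] !== 0;
}
```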

mstange and others added 30 commits November 23, 2023 15:38

Store both the inverted and the non-inverted selected call path / expanded call paths in the redux state.

bd97b75

Only respect the invertCallstack state in the call tree tab.

6dd845e

This means that the flame graph and the stack chart will now always display un-inverted. Fixes #3961. Fixes #4803.

Add a test which checks that selection is tracked independently for inverted and non-inverted views.

ac4a1ac

Remove non-functional "Invert call stack" checkbox from the stack chart tab.

f259c8c

Cache the inverted thread separately with a separate selector.

8bf6a05

This makes it faster to switch back and forth between inverted and non-inverted mode. It also makes it faster to switch tabs between inverted call tree and the chart views which are always non-inverted.

Add a comment to MaybeFlameGraph saying it's not needed any more.

3c0c07c

Make the flame graph and stack chart always non-inverted (Merge PR #4804)

efbe020

Simplify heightFunction.

1cf6a1c

Pass it just the sample index, and have it return the depth. The yPixelsPerHeight multiplication can happen in the caller.
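A minimal sketch of that split, assuming a hypothetical per-sample depth array; the names `HeightFunction`, `makeHeightFunction` and `computeBarHeight` are illustrative and not taken from the profiler code.

```typescript
// The height function only maps a sample index to a depth (or null if the
// sample has no call node). It knows nothing about pixels.
type HeightFunction = (sampleIndex: number) => number | null;

// Hypothetical input: one depth per sample, null for samples without a stack.
function makeHeightFunction(sampleDepths: Array<number | null>): HeightFunction {
  return (sampleIndex) => sampleDepths[sampleIndex];
}

// The caller is responsible for converting the depth into pixels.
function computeBarHeight(
  heightFunction: HeightFunction,
  sampleIndex: number,
  yPixelsPerHeight: number
): number | null {
  const depth = heightFunction(sampleIndex);
  return depth === null ? null : depth * yPixelsPerHeight;
}
```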

Use a call node path to make the stack copy string.

8814cda

This is a minor simplification and removes some raw callNodeTable usage.

Only set fillStyle once in drawSamples.

0cee302

Move the fillStyle setting outside of a loop in the SampleGraph implementation. This was likely moved by mistake in f8631ae . And don't set the fillStyle at all if there's nothing to draw with this color.
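The pattern can be sketched as follows. `FillContext` is a hypothetical minimal interface that a `CanvasRenderingContext2D` would satisfy structurally, and `SampleRect` and this `drawSamples` signature are assumptions for illustration only.

```typescript
// Minimal subset of the canvas 2D context that this sketch needs.
interface FillContext {
  fillStyle: string;
  fillRect(x: number, y: number, width: number, height: number): void;
}

interface SampleRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Draw all rectangles that share one color.
function drawSamples(
  ctx: FillContext,
  color: string,
  rects: SampleRect[]
): void {
  if (rects.length === 0) {
    // Nothing to draw with this color, so don't touch fillStyle at all.
    return;
  }
  // Set the style once, outside of the loop.
  ctx.fillStyle = color;
  for (const rect of rects) {
    ctx.fillRect(rect.x, rect.y, rect.width, rect.height);
  }
}
```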

Make traced durations honor the preview selection.

e1efde6

Fixes #3557.

Only set fillStyle once in drawSamples. (Merge PR #4816)

22e80e7

Stop using sampleCallNodes in ThreadSampleGraph.

5fc4a4b

Stop using sampleCallNodes in ThreadSampleGraph. (Merge PR #4809)

7b888cd

Merge branch 'main' into simplify-height-function

13393be

Simplify heightFunction. (Merge PR #4810)

cfc40b5

Merge branch 'main' into simplify-copied-stack

ce52d95

Use a call node path to make the stack copy string. (Merge PR #4811)

fba3b58

Merge branch 'main' into preview-filtered-traced-timing

c9b23ab

Make traced durations honor the preview selection. (Merge PR #4817)

3e646c7

Only look up windowID when it's used.

866595d

Add firstChild and nextSibling columns to the call node table.

80f3d60

This will be used to make various uses of the call node table more efficient.
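As an illustration of how such columns can be used, here is a sketch that derives `firstChild` and `nextSibling` from a parent column and then walks them to list children. The layout (`-1` meaning "none", a `prefix` parent column) and the function names are assumptions, not the profiler's actual implementation.

```typescript
interface SiblingColumns {
  firstChild: Int32Array;
  nextSibling: Int32Array;
}

// Assumption: prefix[i] is the parent index of call node i, or -1 for a root.
// In both output columns, -1 means "none".
function computeSiblingColumns(prefix: Int32Array): SiblingColumns {
  const count = prefix.length;
  const firstChild = new Int32Array(count).fill(-1);
  const nextSibling = new Int32Array(count).fill(-1);
  // lastChild[p] is the most recently seen child of p, used to chain siblings.
  const lastChild = new Int32Array(count).fill(-1);
  let lastRoot = -1;
  for (let i = 0; i < count; i++) {
    const parent = prefix[i];
    if (parent === -1) {
      if (lastRoot !== -1) {
        nextSibling[lastRoot] = i;
      }
      lastRoot = i;
    } else {
      const previousChild = lastChild[parent];
      if (previousChild === -1) {
        firstChild[parent] = i;
      } else {
        nextSibling[previousChild] = i;
      }
      lastChild[parent] = i;
    }
  }
  return { firstChild, nextSibling };
}

// Walk the sibling chain instead of scanning the whole table for children.
function getChildren(columns: SiblingColumns, nodeIndex: number): number[] {
  const children: number[] = [];
  let child = columns.firstChild[nodeIndex];
  while (child !== -1) {
    children.push(child);
    child = columns.nextSibling[child];
  }
  return children;
}
```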

Use firstChild/nextSibling to speed up getChildren().

e0edc96

Make CallNodeTable sorted in depth-first traversal order.

7b4247e

Replace firstChild with nextAfterDescendants.

f512007

Take advantage of sorted call nodes in getSamplesSelectedStates.

1b4aba2

Take advantage of sorted call nodes in getTimingsForCallNodeIndex.

32cbefe

Take advantage of nextSibling in getCallNodeIndexFromParentAndFunc.

2e21b23

Take advantage of sorted call node table in getTreeOrderComparator.

9d2c18d

mstange and others added 29 commits November 28, 2023 00:21

Expand the comment above getTreeOrderComparator.

71f54aa

Rename nextAfterDescendants to subtreeRangeEnd.

868e6c6
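A sketch of what a `subtreeRangeEnd` column makes possible once the call node table is sorted in depth-first order: every subtree occupies a contiguous index range, so a descendant check becomes a range comparison. The helper names and the way the column is computed here are assumptions for illustration.

```typescript
// Assumption: call nodes are sorted in depth-first traversal order and
// depth[i] is the zero-based depth of call node i.
// subtreeRangeEnd[i] is the index just past the last descendant of node i.
function computeSubtreeRangeEnd(depth: Int32Array): Uint32Array {
  const count = depth.length;
  const subtreeRangeEnd = new Uint32Array(count);
  const openNodes: number[] = []; // nodes whose subtree has not ended yet
  for (let i = 0; i < count; i++) {
    // Node i ends the subtree of every open node at the same or deeper depth.
    while (
      openNodes.length > 0 &&
      depth[openNodes[openNodes.length - 1]] >= depth[i]
    ) {
      const closed = openNodes.pop();
      if (closed !== undefined) {
        subtreeRangeEnd[closed] = i;
      }
    }
    openNodes.push(i);
  }
  // Everything still open extends to the end of the table.
  for (const node of openNodes) {
    subtreeRangeEnd[node] = count;
  }
  return subtreeRangeEnd;
}

// A node is inside the subtree of `ancestor` if its index is in the range.
function isDescendantOrSelf(
  subtreeRangeEnd: Uint32Array,
  ancestor: number,
  node: number
): boolean {
  return node >= ancestor && node < subtreeRangeEnd[ancestor];
}
```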

Optimize call node table traversals (Merge PR #4807)

5610835

Update all Yarn dependencies (2023-11-27)

d8a1a67

Only match js files for prettier and fix style changes

880a77a

Update all Yarn dependencies (2023-11-27) (#4799)

1f9191e

Don't call getChildren in findHeavyPathToSameFunctionAfterInversion.

a774f34

This is a bit faster because we don't need to create children arrays and sort them. It also fixes a bug with diff profiles. (Test coming up in the next commit.)

Compute a correct func names dict for merged threads.

e850ec4

In the past, the indexes would be incorrect for the diffed thread because we were using the dict for one of the input threads, which has a different funcTable.

Add a test for heavy paths in diff profiles.

17931f1

Improve findHeavyPathToSameFunctionAfterInversion. (Merge PR #4813)

5270c86

Don't ignore all of the fixtures directory, just the upgrades directory

d22246c

Run prettier on these files

2ea4245

Don't ignore all of the fixtures directory, just the upgrades directory (#4819)

7e2bb74

Work around Math.abs being slow in Firefox in React development environments.

0b537cb

Improve flame graph performance. (Merge PR #4820)

469a598

Apply missed fixes for PR #4820.

cb21da5

Apply missed fixes for PR #4820. (Merge PR #4824)

625135b

Speed up getCallNodeInfo by using firstChild / nextSibling / currentLastChild instead of a Map to find existing call nodes with the same prefix and func.

ebabce7

Speed up getCallNodeInfo (#4826)

77840bf

Speed up getStackTimingByDepth by taking advantage of the call node table ordering.

b21b421

Before: https://share.firefox.dev/47CvQto (1471ms)
After: https://share.firefox.dev/47z8Ewb (429ms, 3.4x faster)

Reduce the time of switching to the stack chart tab for the first time (Merge PR #4825)

358a1c7

Rename CallTreeCountsAndSummary to CallTreeTimings.

d3e0420

Replace child count with "has children" (Merge PR #4829)

1b08327

julienw merged commit f553df1 into production Dec 5, 2023

julienw merged 63 commits into production from main

3 participants
