
[release/10.0] Port dump collection perf improvements #128023

Merged
hoyosjs merged 3 commits into dotnet:release/10.0 from hoyosjs:juhoyosa/use-heap2-10
May 13, 2026

@hoyosjs
Member

@hoyosjs hoyosjs commented May 11, 2026

Fixes #122459

main PRs:

Description

Backports three DAC performance improvements for minidump collection:

  1. Use SHash as DAC instance hash (#125631): Replaces the hand-rolled hash table in DacInstanceManager with an SHash-based implementation. The previous fixed-bucket hash degraded quickly for Find and insertion operations under high load. Measured ~9.5x speedup for minidump collection against a repro app with 2.5k-frame deep stacks over 50 threads.

  2. Cache debugger patches (#125459, "Cache debugger patches to speed up x64 stackwalk epilogue/prologue scanning"): Caches the list of debugger breakpoint patches so that x64 stack unwinding doesn't re-scan the 1,000-bucket patch hash table on every frame. The cache is populated once on first access and invalidated on Flush(). Measured reduction from 55s to ~7s for minidump collection (10,000 iterations across 10 threads).

  3. Enable CLRDATA_ENUM_MEM_HEAP2 via environment variable: When the target process has DOTNET_EnableFastHeapDumps set, the DAC promotes CLRDATA_ENUM_MEM_HEAP to CLRDATA_ENUM_MEM_HEAP2, which dumps loader heap pages in bulk instead of walking individual runtime structures.
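The opt-in promotion in item 3 can be pictured with a minimal sketch. The enum and the helper name `PromoteHeapFlags` below are hypothetical stand-ins, not the DAC's actual types or code; the only behavior taken from this PR is that a HEAP request becomes HEAP2 when the flag mirrored from the target process is nonzero.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the DAC's memory-enumeration flags. The enumerator names and
// their ordering are illustrative, not the real CLRDATA_ENUM_MEM_* values.
enum class EnumMemFlags { Mini, Heap, Heap2 };

// Hypothetical helper: promote a HEAP request to HEAP2 only when the target
// process opted in (modeled here as a nonzero value read from the target).
inline EnumMemFlags PromoteHeapFlags(EnumMemFlags requested,
                                     std::uint32_t enableFastHeapDumps)
{
    if (requested == EnumMemFlags::Heap && enableFastHeapDumps != 0)
        return EnumMemFlags::Heap2;
    return requested; // without the opt-in, the request is left untouched
}
```

In this sketch only HEAP requests are affected; other dump types pass through unchanged, which matches the "opt-in, no change to default behavior" framing in the Risk section.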

Customer Impact

Customers collecting minidumps of large .NET applications (many threads, deep stacks) experience extremely slow dump collection times - on the order of minutes for what should take seconds. This directly impacts incident response time in production environments. Without these fixes, dump collection through Watson/dotnet-dump/createdump remains unacceptably slow for large workloads.

Regression

Yes, relative to .NET Framework. Customers migrating from it have noticed the slowdown; .NET Framework used non-portable variants of the MSVC library.

Testing

Risk

Low.

  • The SHash and patch cache changes are confined to DAC-only code paths (daccess.cpp, dacfn.cpp, dacimpl.h) that execute only during diagnostic operations (dump collection, debugging). They do not affect runtime execution.
  • The DOTNET_EnableFastHeapDumps env var is opt-in and does not change default behavior.
  • All three changes have been running in main since March without reported issues.

hoyosjs added 3 commits May 10, 2026 23:35
Replaces the hand-rolled hash table implementation in DacInstanceManager
with an `SHash`-based implementation. This hash is more efficient for the
high volume of find operations the DAC issues when verifying cached
cross-process reads. During mini dump collection, DacInstanceManager is
the central cache for all memory read from the target process. The
hand-rolled hash table used a fixed bucket array that degraded quickly
for Find and insertion operations.

Measured minidump collection against a repro app with 2.5k frame deep
stacks over 50 threads and the speedup was roughly 9.5x.
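To illustrate why a fixed bucket array degrades, here is a small self-contained sketch. `FixedBucketTable` and `WorstCaseProbes` are hypothetical names invented for this example; this is neither the old DacInstanceManager code nor SHash. It only shows that with a constant bucket count, chain length (and so Find cost) grows linearly with the number of cached entries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative chained hash table whose bucket count never grows.
// Assumes the bucket count passed to the constructor is greater than zero.
class FixedBucketTable
{
public:
    explicit FixedBucketTable(std::size_t buckets) : buckets_(buckets) {}

    void Insert(std::uint64_t addr, int value)
    {
        buckets_[addr % buckets_.size()].emplace_back(addr, value);
    }

    // Number of chain entries examined until the key is found
    // (or the full chain length if the key is absent).
    std::size_t ProbesToFind(std::uint64_t addr) const
    {
        const auto& chain = buckets_[addr % buckets_.size()];
        std::size_t probes = 0;
        for (const auto& entry : chain)
        {
            ++probes;
            if (entry.first == addr)
                break;
        }
        return probes;
    }

private:
    std::vector<std::vector<std::pair<std::uint64_t, int>>> buckets_;
};

// Inserts keys 0..count-1 and returns the largest probe count seen on lookup.
inline std::size_t WorstCaseProbes(std::size_t buckets, std::size_t count)
{
    FixedBucketTable table(buckets);
    for (std::uint64_t key = 0; key < count; ++key)
        table.Insert(key, 0);

    std::size_t worst = 0;
    for (std::uint64_t key = 0; key < count; ++key)
        worst = std::max(worst, table.ProbesToFind(key));
    return worst;
}
```

With 4 buckets, 4 entries need at most 1 probe, while 400 entries need up to 100. A table that resizes as it fills keeps chains short instead, which is the property the SHash-based replacement relies on.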
…anning (dotnet#125459)

Second partial fix for dotnet#122459

Caches the list of debugger breakpoint patches in the DAC so that x64
stack unwinding doesn't re-scan the patch hash table on every frame.
During mini dump collection, each stack frame triggers
DacReplacePatchesInHostMemory to restore original opcodes before reading
memory — even though there are typically zero active patches during a
dump. The patch hash table has 1,000 fixed buckets, so each call walked
all of them regardless. The cache is populated once on first access and
invalidated only on Flush().

Measured minidump collection against the same repro app with 10,000
iterations across 10 threads. The baseline was 55s, this change alone
brings it to ~7s.
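The populate-once, invalidate-on-Flush pattern described above can be sketched as follows. `PatchCacheSketch`, `PatchEntry`, and the scan callback are hypothetical and stand in for the real DacPatchCache and the walk over the patch hash table, whose actual interfaces are not shown in this conversation.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical record for one active breakpoint patch.
struct PatchEntry
{
    std::uint64_t address;
    std::uint32_t originalOpcode;
};

// Minimal lazy cache: the expensive scan runs on first access only,
// and again only after Flush() has invalidated the cached list.
class PatchCacheSketch
{
public:
    explicit PatchCacheSketch(std::function<std::vector<PatchEntry>()> scan)
        : scan_(std::move(scan)) {}

    const std::vector<PatchEntry>& Get()
    {
        if (!populated_)
        {
            entries_ = scan_();   // stands in for walking every bucket once
            populated_ = true;
        }
        return entries_;
    }

    void Flush()
    {
        entries_.clear();
        populated_ = false;
    }

private:
    std::function<std::vector<PatchEntry>()> scan_;
    std::vector<PatchEntry> entries_;
    bool populated_ = false;
};
```

In the common dump scenario with zero active patches, every frame after the first sees an empty cached list and skips the table walk entirely, which is where the saving in the measurement above would come from.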
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment


Pull request overview

Backport of CoreCLR DAC minidump-collection performance improvements to release/10.0, focused on reducing overhead during heap dump enumeration and x64 stack unwinding, plus an opt-in switch to use the HEAP2 enumeration path for faster heap dumps.

Changes:

  • Replace the DAC instance cache’s prior map implementation with an SHash-based table to improve lookup/insert scalability during dump generation.
  • Add a DAC-side patch cache so x64 unwinding doesn’t repeatedly rescan the debugger patch table on every frame.
  • Introduce DOTNET_EnableFastHeapDumps (via EXTERNAL_EnableFastHeapDumps) to let the target process opt into promoting HEAP dumps to HEAP2 inside the DAC.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/coreclr/vm/vars.hpp | Declares a new global (g_EnableFastHeapDumps) for communicating the opt-in to the DAC. |
| src/coreclr/vm/vars.cpp | Defines/initializes g_EnableFastHeapDumps. |
| src/coreclr/vm/ceemain.cpp | Reads EXTERNAL_EnableFastHeapDumps on startup and stores it in g_EnableFastHeapDumps. |
| src/coreclr/inc/dacvars.h | Exposes g_EnableFastHeapDumps to the DAC via DEFINE_DACVAR. |
| src/coreclr/inc/clrconfigvalues.h | Adds the EXTERNAL_EnableFastHeapDumps config value (env var surface). |
| src/coreclr/debug/daccess/enummem.cpp | Promotes HEAP → HEAP2 when g_EnableFastHeapDumps != 0 during heap dump enumeration. |
| src/coreclr/debug/daccess/dacimpl.h | Switches the instance cache to SHash traits and introduces the DacPatchCache type/member. |
| src/coreclr/debug/daccess/dacfn.cpp | Uses the patch cache in DacReplacePatchesInHostMemory and implements cache population. |
| src/coreclr/debug/daccess/daccess.cpp | Updates instance cache operations to SHash APIs and flushes the new patch cache on DAC Flush(). |

Comment thread src/coreclr/debug/daccess/dacfn.cpp
@hoyosjs hoyosjs merged commit 5aab1c8 into dotnet:release/10.0 May 13, 2026
108 of 109 checks passed
@hoyosjs hoyosjs deleted the juhoyosa/use-heap2-10 branch May 13, 2026 01:11
@snakefoot
Contributor

I guess this should be assigned to the upcoming 10.0.9 milestone?
