[release/10.0] Fix heap_segment_used watermark after compaction #128342

Open
janvorli wants to merge 1 commit into dotnet:release/10.0 from janvorli:backport-10-gc-compact-phase-used-watermark

@janvorli
Member

Backport of #128217 to release/10.0

Customer Impact

  • Customer reported
  • Found internally

With large pages enabled in regions mode, the GC can return non-zeroed memory for an allocation request that expects zeroed memory, which leads to intermittent crashes.

Regression

  • Yes, introduced when GC regions were enabled
  • No

Testing

CI tests, local testing using targeted repro app from the customer, GC tests

Risk

Low. The change maintains the heap_segment_used watermark after compaction so that it covers all the touched memory in a region. Before this change, the watermark was stale (too low) for regions that receive relocated objects.

After `compact_phase`, `heap_segment_used` can be stale (lower than
the actual end of live data) because `plan_phase` sets
`plan_allocated` beyond used for regions that receive relocated objects.

When `decommit_region` later clears memory only up to used instead of
committed (the large-pages / never_decommit_p path),
the gap between used and plan_allocated retains dirty data from
a previous region lifetime, causing heap corruption on the next GC cycle.
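The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyRegion`, `toy_compact_into`, `toy_logical_decommit`, and `toy_dirty_bytes` are hypothetical names invented for illustration and do not correspond to the actual code in gc.cpp. The model assumes the backing memory is never physically released, so "decommit" only clears bytes up to the watermark.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a GC region. Field names are illustrative only,
// not the real heap_segment layout.
struct ToyRegion {
    std::vector<std::uint8_t> mem; // backing memory, never physically decommitted
    std::size_t used = 0;          // watermark: how far the region is believed touched
    explicit ToyRegion(std::size_t size) : mem(size, 0) {}
};

// Models compaction relocating objects into [0, plan_allocated) WITHOUT
// raising the watermark (the pre-fix behavior described in the text).
inline void toy_compact_into(ToyRegion& r, std::size_t plan_allocated) {
    assert(plan_allocated <= r.mem.size());
    std::fill_n(r.mem.begin(), plan_allocated, std::uint8_t{0xCC});
}

// Models the large-pages path: clear only [0, used), not the whole region.
inline void toy_logical_decommit(ToyRegion& r) {
    assert(r.used <= r.mem.size());
    std::fill_n(r.mem.begin(), r.used, std::uint8_t{0});
    r.used = 0;
}

// Counts bytes that would be handed out non-zeroed when the region is reused.
inline std::size_t toy_dirty_bytes(const ToyRegion& r) {
    return static_cast<std::size_t>(
        std::count_if(r.mem.begin(), r.mem.end(),
                      [](std::uint8_t b) { return b != 0; }));
}
```

In this model, a 64-byte region with `used == 16` that receives relocated data up to offset 48 and is then logically decommitted still holds 32 non-zero bytes, which is the gap between the stale watermark and `plan_allocated`.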

Fix:
At the end of `compact_phase`, bump heap_segment_used to
`max(used, plan_allocated)` for every non-read-only region in
the condemned generations and one generation above (the maximum
compaction target range).

The fix cost is zero when no compaction occurs. When compaction
does occur, it avoids unnecessary `memclr` in `decommit_region`
by keeping the used watermark accurate, so only truly unused memory
is cleared.
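The watermark update described in the fix can be sketched roughly as follows. This is not the code from the PR: `SketchRegion` and `sketch_bump_used_after_compaction` are hypothetical names, and the flat vector of regions stands in for whatever per-generation region lists the real GC walks. It only shows the `max(used, plan_allocated)` rule applied to non-read-only regions in the condemned generations plus one generation above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified view of a region for illustration only.
struct SketchRegion {
    int generation;             // generation the region belongs to
    bool read_only;             // read-only regions are skipped
    std::size_t used;           // touched-memory watermark (offset from region start)
    std::size_t plan_allocated; // planned end of live data after compaction
};

// Raise `used` to at least `plan_allocated` for every non-read-only region
// in generations 0..condemned_gen+1 (capped at max_gen).
inline void sketch_bump_used_after_compaction(std::vector<SketchRegion>& regions,
                                              int condemned_gen, int max_gen) {
    assert(condemned_gen >= 0 && condemned_gen <= max_gen);
    const int last_gen = std::min(condemned_gen + 1, max_gen);
    for (SketchRegion& r : regions) {
        if (r.read_only || r.generation > last_gen) {
            continue; // outside the maximum compaction target range
        }
        r.used = std::max(r.used, r.plan_allocated);
    }
}
```

Because the update only ever raises `used`, regions whose watermark is already at or beyond `plan_allocated` are left unchanged.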
@janvorli janvorli added this to the 10.0.x milestone May 18, 2026
@janvorli janvorli requested a review from VSadov May 18, 2026 22:25
@janvorli janvorli self-assigned this May 18, 2026
Copilot AI review requested due to automatic review settings May 18, 2026 22:25
@janvorli janvorli added area-GC-coreclr Servicing-consider Issue for next servicing release review labels May 18, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

This backport addresses a GC regions + large pages correctness issue where stale heap_segment_used values after compaction could cause decommit_region to clear an insufficient range, allowing dirty memory to be reused and leading to heap corruption/crashes.

Changes:

  • In compact_phase (regions mode), updates each affected region’s heap_segment_used to cover max(used, plan_allocated) for condemned generations and one generation above.
  • In decommit_heap_segment (regions mode), skips decommitting when large pages are enabled to avoid incorrect “logical decommit” behavior (large-page decommit is a no-op).

@janvorli
Member Author

cc: @BenV

@BenV

BenV commented May 19, 2026

cc: @BenV

Just wanted to confirm that this appears to resolve the issue on .NET 10 as expected. I'll keep running the tests for the full 4 hours just in case and report back. Thanks again for all your support @janvorli @cshung @mangod9, we really appreciate it!

Edit: The stress tests passed full 4 hour runs on both of my test machines!

@JulieLeeMSFT
Member

Fixes #127687.

@JulieLeeMSFT JulieLeeMSFT added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels May 19, 2026
@JulieLeeMSFT
Member

@janvorli, once it is code reviewed, we can merge.

Comment thread src/coreclr/gc/gc.cpp
{
    return;
}

Member

This block is not in the net11 fix. Is it a part of something else?

Labels

area-GC-coreclr Servicing-approved Approved for servicing release
