Skip to content

Delete GC handles in COOP mode#125350

Draft
EgorBo wants to merge 14 commits intodotnet:mainfrom
EgorBo:delete-handles-coop
Draft

Delete GC handles in COOP mode#125350
EgorBo wants to merge 14 commits intodotnet:mainfrom
EgorBo:delete-handles-coop

Conversation

@EgorBo
Copy link
Copy Markdown
Member

@EgorBo EgorBo commented Mar 9, 2026

Fixes #117138 crash.

Addresses the idea in this comment: #117138 (comment)

I've placed the GCX_COOP everywhere where the contract is either PREEMPTIVE or ANY (but also checked the callers of the callers)
It was not clear to me what is the right contract for destructors, though.

Copilot AI review requested due to automatic review settings March 9, 2026 21:46
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @agocke
See info in area-owners.md if you want to be subscribed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a GC-handle-related crash (#117138) by ensuring GC handle destruction sites run in cooperative GC mode, and by tightening the contract on common handle-destruction helpers.

Changes:

  • Add cooperative-mode transitions (GCX_COOP) before various GC handle destruction calls across VM, interop, debugger, and binder code paths.
  • Update DestroyHandleCommon’s contract to require cooperative mode (MODE_COOPERATIVE) rather than allowing any mode.
  • Adjust a few cleanup loops/destructors to perform handle destruction under a cooperative-mode scope.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/coreclr/vm/threads.cpp Switch to coop before destroying thread-associated handles.
src/coreclr/vm/syncblk.cpp Switch to coop before destroying SyncBlock lock handle.
src/coreclr/vm/jitinterface.h Switch to coop before freeing per-JIT handles in CEEInfo dtor.
src/coreclr/vm/interoplibinterface_comwrappers.cpp Switch to coop before destroying refcounted handles exposed to interop.
src/coreclr/vm/gchandleutilities.h Tighten destruction helper contract to require cooperative mode.
src/coreclr/vm/exinfo.h Switch to coop before destroying exception throwable handle.
src/coreclr/vm/exinfo.cpp Switch to coop before destroying exception throwable handle in ReleaseResources.
src/coreclr/vm/encee.cpp Switch to coop during EnC sync block cleanup when destroying dependent handles.
src/coreclr/vm/eedbginterfaceimpl.cpp Switch to coop before destroying debug-created handles.
src/coreclr/vm/dynamicmethod.cpp Switch to coop before destroying long weak handle for resolver.
src/coreclr/vm/dllimportcallback.cpp Switch to coop before destroying long weak handle in thunk termination.
src/coreclr/vm/comconnectionpoints.h Switch to coop before destroying connection cookie handle.
src/coreclr/vm/comcallablewrapper.cpp Switch to coop before destroying refcounted CCW handle.
src/coreclr/vm/clrex.cpp Switch to coop before destroying throwable handle in exception destructor.
src/coreclr/vm/appdomain.cpp Switch to coop before destroying pinned heap handle bucket strong handle.
src/coreclr/debug/ee/debugger.h Switch to coop before destroying long weak handles on SHash entry removal.
src/coreclr/debug/ee/debugger.cpp Switch to coop before destroying debugger-disposed handles and force-catch table handles.
src/coreclr/binder/customassemblybinder.cpp Switch to coop before destroying ALC weak/strong handles on release.

Copilot AI review requested due to automatic review settings March 10, 2026 02:11
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/coreclr/vm/jitinterface.h:399

  • CEEInfo::~CEEInfo() is still annotated with LIMITED_METHOD_CONTRACT, but it now explicitly switches GC mode (GCX_COOP) and destroys GC handles. LIMITED_METHOD_CONTRACT is intended for trivial leaf methods, and this destructor is doing non-trivial runtime/GC work; the contract should be updated to a CONTRACTL (or equivalent) that accurately declares the GC mode behavior and lock-taking semantics of the handle destruction path.
    virtual ~CEEInfo()
    {
        LIMITED_METHOD_CONTRACT;

#if !defined(DACCESS_COMPILE)
        // Free all handles used by JIT
        if (m_pJitHandles != nullptr)
        {
            GCX_COOP();
            OBJECTHANDLE* elements = m_pJitHandles->GetElements();
            unsigned count = m_pJitHandles->GetCount();
            for (unsigned i = 0; i < count; i++)
            {

}
CONTRACTL_END;

// Null the handle value first so that GC will not follow a stale
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't solve the problem. DestroyHandleOfType nulls the handle already (first line in TableFreeSingleHandleToCache ).

We need to make sure that the GC handles are not getting scanned by the GC when the handle is released in preemptive mode. We can do that by taking the "handle manager lock" - take the slow path in GC handle allocator by skipping all lock-free caches.

Copilot AI review requested due to automatic review settings March 11, 2026 03:10
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

src/coreclr/debug/ee/debugger.cpp:11003

  • GCX_COOP_EEINTERFACE_IFTHREAD() is a no-op when running on the native debugger helper thread (no managed Thread), but the code still calls DestroyHandle helpers that ultimately use the normal (unlocked) handle free path. If DisposeHandle can run concurrently with GC, this still risks the original race/crash. Consider using the new locked/unsafe handle destruction path when g_pEEInterface->GetThread() == NULL (mapping CorDebugHandleType to the corresponding HNDTYPE_), similar to other call sites that branch on GetThreadNULLOk().
            // Switch to cooperative mode if a managed thread exists.
            // On the native debugger helper thread (no managed Thread),
            // the IFTHREAD variant is a no-op and the MODE_COOPERATIVE
            // contract in DestroyHandleCommon is also a no-op.
            GCX_COOP_EEINTERFACE_IFTHREAD();

            switch (handleType)
            {
            case HANDLE_STRONG:
                DestroyStrongHandle(objectHandle);
                break;
            case HANDLE_WEAK_TRACK_RESURRECTION:
                DestroyLongWeakHandle(objectHandle);
                break;
            case HANDLE_PINNED:
                DestroyPinningHandle(objectHandle);
                break;

EgorBo and others added 2 commits March 24, 2026 17:14
…ty, debugger paths

- Move DestroyHandleOfTypeLocked to end of IGCHandleManager for non-breaking
  GC interface compatibility; bump GC_INTERFACE_MINOR_VERSION 8 -> 9
- Replace SafeSetThrowables(NULL) in ~Thread() with direct preemptive-safe
  handle destruction for m_LastThrownObjectHandle, m_ExposedObject, and
  m_StrongHndToExposedObject to avoid calling DestroyHandle outside COOP mode
  and prevent handle leaks when CooperativeCleanup is skipped
- Use DestroyHandleInPreemptiveMode in debugger DisposeHandle,
  UpdateForceCatchHandlerFoundTable, and OnRemovePerEntryCleanupAction since
  these can run on the helper thread with no managed Thread
- Remove unused GCX_COOP_EEINTERFACE_IFTHREAD macro
- Fix misleading comment on DestroyHandleInPreemptiveMode

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jkotas
Copy link
Copy Markdown
Member

jkotas commented Mar 24, 2026

Test failures look related

… destroyed

CooperativeCleanup (called from OnThreadTerminate) now destroys
m_StrongHndToExposedObject and sets it to NULL. DecExternalCount,
called later in OnThreadTerminate, dereferences the handle without
a NULL check, causing an access violation.

Add a NULL guard before the dereference.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 25, 2026 12:20
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/coreclr/vm/jitinterface.h:400

  • CEEInfo::~CEEInfo uses LIMITED_METHOD_CONTRACT, but the added GCX_COOP() can take locks and change GC mode, which is typically outside what a limited contract allows. This may cause contract asserts in debug builds. Consider switching to a non-limited contract that permits the mode change, or avoid the mode switch by using DestroyHandleInPreemptiveMode(elements[i], HNDTYPE_DEFAULT) (since this is cleanup code and performance isn’t critical).
#if !defined(DACCESS_COMPILE)
        // Free all handles used by JIT
        if (m_pJitHandles != nullptr)
        {
            GCX_COOP();
            OBJECTHANDLE* elements = m_pJitHandles->GetElements();
            unsigned count = m_pJitHandles->GetCount();
            for (unsigned i = 0; i < count; i++)
            {
                DestroyHandle(elements[i]);

if ((retVal == 1) && ((*((void**)m_StrongHndToExposedObject)) != NULL))
// m_StrongHndToExposedObject may already be NULL if CooperativeCleanup
// destroyed it earlier during OnThreadTerminate.
if ((retVal == 1) && (m_StrongHndToExposedObject != NULL) && ((*((void**)m_StrongHndToExposedObject)) != NULL))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

     // We do not have to DisablePreemptiveGC here, because
     // we just want to put NULL into a handle.

This code below has a subtle race condition. It can race with GC relocating the handle value. The GC can override the NULL stored into the handle with the relocated value.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, looks so, just wanted to check if that was the only place failing on CI

// Free all handles used by JIT
if (m_pJitHandles != nullptr)
{
GCX_COOP();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method should not have LIMITED_METHOD_CONTRACT now that it is switching GC modes.

@EgorBo
Copy link
Copy Markdown
Member Author

EgorBo commented Mar 25, 2026

sorry, some changes were pushed by my agent without my approval - I'm still working on it.

…n CEEInfo

- DecExternalCount: Replace racy StoreObjectInHandle(NULL) in preemptive
  mode with DestroyHandleInPreemptiveMode which takes the handle table
  lock, preventing GC from overwriting the NULL with a relocated value.
- CEEInfo::~CEEInfo: Replace LIMITED_METHOD_CONTRACT with
  WRAPPER_NO_CONTRACT since GCX_COOP() switches GC modes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 25, 2026 14:08
@EgorBo EgorBo force-pushed the delete-handles-coop branch from f7826a4 to 1aaf9dc Compare March 25, 2026 14:08
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

EgorBo and others added 2 commits March 26, 2026 01:32
…sed object handles

CooperativeCleanup() was destroying m_ExposedObject and m_StrongHndToExposedObject
during DetachThread(), but these handles are still needed afterwards by
CleanupDetachedThreads() which calls IncExternalCount() (dereferences the handle
directly), GetExposedObject() (calls ObjectFromHandle), and the managed
OnThreadExiting callback. ShouldTriggerGCForDeadThreads() also calls
GetExposedObjectRaw() on dead threads.

Remove the early handle destruction from CooperativeCleanup. The handles are
safely destroyed later in ~Thread() via DestroyHandleInPreemptiveMode (locked path).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 26, 2026 16:47
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 4 comments.

Comment on lines +2176 to +2177
// m_StrongHndToExposedObject may already be NULL if CooperativeCleanup
// destroyed it earlier during OnThreadTerminate.
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new comment says m_StrongHndToExposedObject may already be NULL if CooperativeCleanup destroyed it, but CooperativeCleanup explicitly does not destroy m_StrongHndToExposedObject/m_ExposedObject. This comment looks stale/misleading and should be updated to reflect the current cleanup flow (or the code should be adjusted if CooperativeCleanup is expected to null/destroy it).

Suggested change
// m_StrongHndToExposedObject may already be NULL if CooperativeCleanup
// destroyed it earlier during OnThreadTerminate.
// m_StrongHndToExposedObject may already be NULL if no exposed thread object
// was ever created for this Thread or if its strong handle was cleared earlier.

Copilot uses AI. Check for mistakes.
Comment on lines +2180 to +2183
// Destroy the strong handle under the handle table lock to avoid
// racing with GC relocating the handle value in preemptive mode.
DestroyHandleInPreemptiveMode(m_StrongHndToExposedObject, HNDTYPE_STRONG);
m_StrongHndToExposedObject = NULL;
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DecExternalCount previously cleared the value in m_StrongHndToExposedObject but kept the handle slot alive. Destroying the handle and then setting m_StrongHndToExposedObject = NULL changes semantics and risks null-dereferencing later (e.g., IncExternalCount dereferences m_StrongHndToExposedObject unconditionally) or preventing the strong handle from being re-established if the external refcount increases again. Consider keeping the handle allocated and only clearing its referent under the handle table lock (or recreating a fresh strong handle slot if you truly need to destroy it), but avoid setting the member itself to NULL while the Thread object is still usable.

Suggested change
// Destroy the strong handle under the handle table lock to avoid
// racing with GC relocating the handle value in preemptive mode.
DestroyHandleInPreemptiveMode(m_StrongHndToExposedObject, HNDTYPE_STRONG);
m_StrongHndToExposedObject = NULL;
// Clear the strong handle's referent under the handle table lock to avoid
// racing with GC relocating the handle value in preemptive mode, but keep
// the handle itself alive so it can be reused if the external refcount
// increases again.
StoreObjectInHandle(m_StrongHndToExposedObject, NULL);

Copilot uses AI. Check for mistakes.
Comment on lines +2288 to +2292
// Destroy any remaining handles that weren't cleaned up by CooperativeCleanup.
// Use DestroyHandleInPreemptiveMode since the destructor can run outside
// cooperative mode (e.g., if HasStarted failed and CooperativeCleanup was skipped).
if (m_LastThrownObjectHandle != NULL &&
!CLRException::IsPreallocatedExceptionHandle(m_LastThrownObjectHandle))
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~Thread() no longer calls SafeSetThrowables(NULL), and the replacement code only destroys m_LastThrownObjectHandle. If CooperativeCleanup was skipped (as the new comment notes), this risks leaking the current throwable handle held by ThreadExceptionState / ExInfo trackers (and potentially leaving other exception-related handles alive). Consider either keeping SafeSetThrowables(NULL) here (now that handle destruction paths have been made safe for non-managed threads) or explicitly clearing/destroying the ThreadExceptionState throwable handle(s) in addition to m_LastThrownObjectHandle.

Copilot uses AI. Check for mistakes.

// Variant of DestroyHandleOfType that takes the handle table lock.
// Safe to call from preemptive mode or from threads not known to the runtime.
virtual void DestroyHandleOfTypeLocked(OBJECTHANDLE handle, HandleType type) PURE_VIRTUAL
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a new pure virtual to IGCHandleManager and bumping GC_INTERFACE_MINOR_VERSION to 9 can break standalone GC compatibility: the runtime currently allows loading GCs with a lower minor version (it only logs), but DestroyHandleInPreemptiveMode will unconditionally call DestroyHandleOfTypeLocked. Either make the standalone GC version check fail when MinorVersion < 9, or gate calls to the new method behind a version check/fallback so older standalone GCs remain loadable.

Suggested change
virtual void DestroyHandleOfTypeLocked(OBJECTHANDLE handle, HandleType type) PURE_VIRTUAL
virtual void DestroyHandleOfTypeLocked(OBJECTHANDLE handle, HandleType type)
{
// Default fallback for GCs that do not provide a locked implementation.
// For older standalone GCs, this preserves the legacy behavior.
DestroyHandleOfType(handle, type);
}

Copilot uses AI. Check for mistakes.
if (!CLRException::IsPreallocatedExceptionHandle(m_hThrowable))
{
DestroyHandle(m_hThrowable);
Thread *pThread = GetThreadNULLOk();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can we ever end up with a managed exception on a thread without Thread*?

// This method may be called on an unmanaged thread, in which case no interception can be done.

I think this comment that motivated the copilot feedback is bogus. Notice that the line right before this comment is accessing pThread->GetExceptionState(), and all callers pass in pThread as the current thread as far as I can tell.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, is it not possible to get here without a Thread* existing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AF: *(_UNCHECKED_OBJECTREF *)handle == NULL (HndCreateHandle called by getJitHandleForObject)

5 participants