Conversation
|
Tagging subscribers to this area: @agocke |
There was a problem hiding this comment.
Pull request overview
This PR addresses a GC-handle-related crash (#117138) by ensuring GC handle destruction sites run in cooperative GC mode, and by tightening the contract on common handle-destruction helpers.
Changes:
- Add cooperative-mode transitions (
GCX_COOP) before various GC handle destruction calls across VM, interop, debugger, and binder code paths. - Update
DestroyHandleCommon’s contract to require cooperative mode (MODE_COOPERATIVE) rather than allowing any mode. - Adjust a few cleanup loops/destructors to perform handle destruction under a cooperative-mode scope.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/vm/threads.cpp | Switch to coop before destroying thread-associated handles. |
| src/coreclr/vm/syncblk.cpp | Switch to coop before destroying SyncBlock lock handle. |
| src/coreclr/vm/jitinterface.h | Switch to coop before freeing per-JIT handles in CEEInfo dtor. |
| src/coreclr/vm/interoplibinterface_comwrappers.cpp | Switch to coop before destroying refcounted handles exposed to interop. |
| src/coreclr/vm/gchandleutilities.h | Tighten destruction helper contract to require cooperative mode. |
| src/coreclr/vm/exinfo.h | Switch to coop before destroying exception throwable handle. |
| src/coreclr/vm/exinfo.cpp | Switch to coop before destroying exception throwable handle in ReleaseResources. |
| src/coreclr/vm/encee.cpp | Switch to coop during EnC sync block cleanup when destroying dependent handles. |
| src/coreclr/vm/eedbginterfaceimpl.cpp | Switch to coop before destroying debug-created handles. |
| src/coreclr/vm/dynamicmethod.cpp | Switch to coop before destroying long weak handle for resolver. |
| src/coreclr/vm/dllimportcallback.cpp | Switch to coop before destroying long weak handle in thunk termination. |
| src/coreclr/vm/comconnectionpoints.h | Switch to coop before destroying connection cookie handle. |
| src/coreclr/vm/comcallablewrapper.cpp | Switch to coop before destroying refcounted CCW handle. |
| src/coreclr/vm/clrex.cpp | Switch to coop before destroying throwable handle in exception destructor. |
| src/coreclr/vm/appdomain.cpp | Switch to coop before destroying pinned heap handle bucket strong handle. |
| src/coreclr/debug/ee/debugger.h | Switch to coop before destroying long weak handles on SHash entry removal. |
| src/coreclr/debug/ee/debugger.cpp | Switch to coop before destroying debugger-disposed handles and force-catch table handles. |
| src/coreclr/binder/customassemblybinder.cpp | Switch to coop before destroying ALC weak/strong handles on release. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/coreclr/vm/jitinterface.h:399
CEEInfo::~CEEInfo()is still annotated withLIMITED_METHOD_CONTRACT, but it now explicitly switches GC mode (GCX_COOP) and destroys GC handles.LIMITED_METHOD_CONTRACTis intended for trivial leaf methods, and this destructor is doing non-trivial runtime/GC work; the contract should be updated to aCONTRACTL(or equivalent) that accurately declares the GC mode behavior and lock-taking semantics of the handle destruction path.
virtual ~CEEInfo()
{
LIMITED_METHOD_CONTRACT;
#if !defined(DACCESS_COMPILE)
// Free all handles used by JIT
if (m_pJitHandles != nullptr)
{
GCX_COOP();
OBJECTHANDLE* elements = m_pJitHandles->GetElements();
unsigned count = m_pJitHandles->GetCount();
for (unsigned i = 0; i < count; i++)
{
src/coreclr/vm/gchandleutilities.h
Outdated
| } | ||
| CONTRACTL_END; | ||
|
|
||
| // Null the handle value first so that GC will not follow a stale |
There was a problem hiding this comment.
This won't solve the problem. DestroyHandleOfType nulls the handle already (first line in TableFreeSingleHandleToCache ).
We need to make sure that the GC handles are not getting scanned by the GC when the handle is released in preemptive mode. We can do that by taking the "handle manager lock" - take the slow path in GC handle allocator by skipping all lock-free caches.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
src/coreclr/debug/ee/debugger.cpp:11003
- GCX_COOP_EEINTERFACE_IFTHREAD() is a no-op when running on the native debugger helper thread (no managed Thread), but the code still calls DestroyHandle helpers that ultimately use the normal (unlocked) handle free path. If DisposeHandle can run concurrently with GC, this still risks the original race/crash. Consider using the new locked/unsafe handle destruction path when g_pEEInterface->GetThread() == NULL (mapping CorDebugHandleType to the corresponding HNDTYPE_), similar to other call sites that branch on GetThreadNULLOk().
// Switch to cooperative mode if a managed thread exists.
// On the native debugger helper thread (no managed Thread),
// the IFTHREAD variant is a no-op and the MODE_COOPERATIVE
// contract in DestroyHandleCommon is also a no-op.
GCX_COOP_EEINTERFACE_IFTHREAD();
switch (handleType)
{
case HANDLE_STRONG:
DestroyStrongHandle(objectHandle);
break;
case HANDLE_WEAK_TRACK_RESURRECTION:
DestroyLongWeakHandle(objectHandle);
break;
case HANDLE_PINNED:
DestroyPinningHandle(objectHandle);
break;
…ty, debugger paths - Move DestroyHandleOfTypeLocked to end of IGCHandleManager for non-breaking GC interface compatibility; bump GC_INTERFACE_MINOR_VERSION 8 -> 9 - Replace SafeSetThrowables(NULL) in ~Thread() with direct preemptive-safe handle destruction for m_LastThrownObjectHandle, m_ExposedObject, and m_StrongHndToExposedObject to avoid calling DestroyHandle outside COOP mode and prevent handle leaks when CooperativeCleanup is skipped - Use DestroyHandleInPreemptiveMode in debugger DisposeHandle, UpdateForceCatchHandlerFoundTable, and OnRemovePerEntryCleanupAction since these can run on the helper thread with no managed Thread - Remove unused GCX_COOP_EEINTERFACE_IFTHREAD macro - Fix misleading comment on DestroyHandleInPreemptiveMode Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Test failures look related |
… destroyed CooperativeCleanup (called from OnThreadTerminate) now destroys m_StrongHndToExposedObject and sets it to NULL. DecExternalCount, called later in OnThreadTerminate, dereferences the handle without a NULL check, causing an access violation. Add a NULL guard before the dereference. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/coreclr/vm/jitinterface.h:400
CEEInfo::~CEEInfousesLIMITED_METHOD_CONTRACT, but the addedGCX_COOP()can take locks and change GC mode, which is typically outside what a limited contract allows. This may cause contract asserts in debug builds. Consider switching to a non-limited contract that permits the mode change, or avoid the mode switch by usingDestroyHandleInPreemptiveMode(elements[i], HNDTYPE_DEFAULT)(since this is cleanup code and performance isn’t critical).
#if !defined(DACCESS_COMPILE)
// Free all handles used by JIT
if (m_pJitHandles != nullptr)
{
GCX_COOP();
OBJECTHANDLE* elements = m_pJitHandles->GetElements();
unsigned count = m_pJitHandles->GetCount();
for (unsigned i = 0; i < count; i++)
{
DestroyHandle(elements[i]);
| if ((retVal == 1) && ((*((void**)m_StrongHndToExposedObject)) != NULL)) | ||
| // m_StrongHndToExposedObject may already be NULL if CooperativeCleanup | ||
| // destroyed it earlier during OnThreadTerminate. | ||
| if ((retVal == 1) && (m_StrongHndToExposedObject != NULL) && ((*((void**)m_StrongHndToExposedObject)) != NULL)) |
There was a problem hiding this comment.
// We do not have to DisablePreemptiveGC here, because // we just want to put NULL into a handle.
This code below has a subtle race condition. It can race with GC relocating the handle value. The GC can override the NULL stored into the handle with the relocated value.
There was a problem hiding this comment.
Yep, looks so, just wanted to check if that was the only place failing on CI
| // Free all handles used by JIT | ||
| if (m_pJitHandles != nullptr) | ||
| { | ||
| GCX_COOP(); |
There was a problem hiding this comment.
This method should not have LIMITED_METHOD_CONTRACT now that it is switching GC modes.
|
sorry, some changes were pushed by my agent without my approval - I'm still working on it. |
…n CEEInfo - DecExternalCount: Replace racy StoreObjectInHandle(NULL) in preemptive mode with DestroyHandleInPreemptiveMode which takes the handle table lock, preventing GC from overwriting the NULL with a relocated value. - CEEInfo::~CEEInfo: Replace LIMITED_METHOD_CONTRACT with WRAPPER_NO_CONTRACT since GCX_COOP() switches GC modes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
f7826a4 to
1aaf9dc
Compare
…sed object handles CooperativeCleanup() was destroying m_ExposedObject and m_StrongHndToExposedObject during DetachThread(), but these handles are still needed afterwards by CleanupDetachedThreads() which calls IncExternalCount() (dereferences the handle directly), GetExposedObject() (calls ObjectFromHandle), and the managed OnThreadExiting callback. ShouldTriggerGCForDeadThreads() also calls GetExposedObjectRaw() on dead threads. Remove the early handle destruction from CooperativeCleanup. The handles are safely destroyed later in ~Thread() via DestroyHandleInPreemptiveMode (locked path). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| // m_StrongHndToExposedObject may already be NULL if CooperativeCleanup | ||
| // destroyed it earlier during OnThreadTerminate. |
There was a problem hiding this comment.
The new comment says m_StrongHndToExposedObject may already be NULL if CooperativeCleanup destroyed it, but CooperativeCleanup explicitly does not destroy m_StrongHndToExposedObject/m_ExposedObject. This comment looks stale/misleading and should be updated to reflect the current cleanup flow (or the code should be adjusted if CooperativeCleanup is expected to null/destroy it).
| // m_StrongHndToExposedObject may already be NULL if CooperativeCleanup | |
| // destroyed it earlier during OnThreadTerminate. | |
| // m_StrongHndToExposedObject may already be NULL if no exposed thread object | |
| // was ever created for this Thread or if its strong handle was cleared earlier. |
| // Destroy the strong handle under the handle table lock to avoid | ||
| // racing with GC relocating the handle value in preemptive mode. | ||
| DestroyHandleInPreemptiveMode(m_StrongHndToExposedObject, HNDTYPE_STRONG); | ||
| m_StrongHndToExposedObject = NULL; |
There was a problem hiding this comment.
DecExternalCount previously cleared the value in m_StrongHndToExposedObject but kept the handle slot alive. Destroying the handle and then setting m_StrongHndToExposedObject = NULL changes semantics and risks null-dereferencing later (e.g., IncExternalCount dereferences m_StrongHndToExposedObject unconditionally) or preventing the strong handle from being re-established if the external refcount increases again. Consider keeping the handle allocated and only clearing its referent under the handle table lock (or recreating a fresh strong handle slot if you truly need to destroy it), but avoid setting the member itself to NULL while the Thread object is still usable.
| // Destroy the strong handle under the handle table lock to avoid | |
| // racing with GC relocating the handle value in preemptive mode. | |
| DestroyHandleInPreemptiveMode(m_StrongHndToExposedObject, HNDTYPE_STRONG); | |
| m_StrongHndToExposedObject = NULL; | |
| // Clear the strong handle's referent under the handle table lock to avoid | |
| // racing with GC relocating the handle value in preemptive mode, but keep | |
| // the handle itself alive so it can be reused if the external refcount | |
| // increases again. | |
| StoreObjectInHandle(m_StrongHndToExposedObject, NULL); |
| // Destroy any remaining handles that weren't cleaned up by CooperativeCleanup. | ||
| // Use DestroyHandleInPreemptiveMode since the destructor can run outside | ||
| // cooperative mode (e.g., if HasStarted failed and CooperativeCleanup was skipped). | ||
| if (m_LastThrownObjectHandle != NULL && | ||
| !CLRException::IsPreallocatedExceptionHandle(m_LastThrownObjectHandle)) |
There was a problem hiding this comment.
~Thread() no longer calls SafeSetThrowables(NULL), and the replacement code only destroys m_LastThrownObjectHandle. If CooperativeCleanup was skipped (as the new comment notes), this risks leaking the current throwable handle held by ThreadExceptionState / ExInfo trackers (and potentially leaving other exception-related handles alive). Consider either keeping SafeSetThrowables(NULL) here (now that handle destruction paths have been made safe for non-managed threads) or explicitly clearing/destroying the ThreadExceptionState throwable handle(s) in addition to m_LastThrownObjectHandle.
|
|
||
| // Variant of DestroyHandleOfType that takes the handle table lock. | ||
| // Safe to call from preemptive mode or from threads not known to the runtime. | ||
| virtual void DestroyHandleOfTypeLocked(OBJECTHANDLE handle, HandleType type) PURE_VIRTUAL |
There was a problem hiding this comment.
Adding a new pure virtual to IGCHandleManager and bumping GC_INTERFACE_MINOR_VERSION to 9 can break standalone GC compatibility: the runtime currently allows loading GCs with a lower minor version (it only logs), but DestroyHandleInPreemptiveMode will unconditionally call DestroyHandleOfTypeLocked. Either make the standalone GC version check fail when MinorVersion < 9, or gate calls to the new method behind a version check/fallback so older standalone GCs remain loadable.
| virtual void DestroyHandleOfTypeLocked(OBJECTHANDLE handle, HandleType type) PURE_VIRTUAL | |
| virtual void DestroyHandleOfTypeLocked(OBJECTHANDLE handle, HandleType type) | |
| { | |
| // Default fallback for GCs that do not provide a locked implementation. | |
| // For older standalone GCs, this preserves the legacy behavior. | |
| DestroyHandleOfType(handle, type); | |
| } |
| if (!CLRException::IsPreallocatedExceptionHandle(m_hThrowable)) | ||
| { | ||
| DestroyHandle(m_hThrowable); | ||
| Thread *pThread = GetThreadNULLOk(); |
There was a problem hiding this comment.
How can we ever end up with a managed exception on a thread without Thread*?
// This method may be called on an unmanaged thread, in which case no interception can be done.
I think this comment that motivated the copilot feedback is bogus. Notice that the line right before this comment is accessing pThread->GetExceptionState(), and all callers pass in pThread as the current thread as far as I can tell.
There was a problem hiding this comment.
Right, is it not possible to get here without a Thread* existing.
Fixes #117138 crash.
Addresses the idea in this comment: #117138 (comment)
I've placed the GCX_COOP everywhere where the contract is either PREEMPTIVE or ANY (but also checked the callers of the callers)
It was not clear to me what is the right contract for destructors, though.