[cDAC] Implement DacDbi GetNativeCodeInfo / GetNativeCodeInfoForAddr #128338

Open
rcj1 wants to merge 13 commits into main from copilot/cdac-enc-getnativecodeinfo
Contributor

@rcj1 rcj1 commented May 18, 2026

  • Restructures debugger-side lookup of EnC version data. Previously we had to walk through a hash table of all the DebuggerMethodInfos and then the DebuggerJitInfos to get this data. It was noted that this is quite heavy just to get this one piece of data. We don't have that many EnC versions, so we now have a linked list of these stored on the Module.
  • Introduces EnC contract to get EnC-related data.
    TargetNUInt GetLatestEnCVersion(TargetPointer module, uint methodDef);
    TargetNUInt GetEnCVersion(TargetPointer module, uint methodDef, TargetCodePointer nativeCodeAddress);
  • Implements cDAC DacDbi APIs GetNativeCodeInfo and GetNativeCodeInfoForAddr.

note to ccr: do not talk about changing com interface compatibility or compatibility with older datadescriptors.

Co-authored-by: rcj1 <77995559+rcj1@users.noreply.github.com>
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds cDAC support for DacDbi native-code lookup APIs and introduces a lightweight EnC data contract/list so debugger-side code can retrieve native code region and EnC version information without walking debugger method-info tables.

Changes:

  • Implements managed cDAC GetNativeCodeInfo / GetNativeCodeInfoForAddr and related interop struct shape.
  • Adds EnC data descriptors, contract registration, native Module EnC data storage, and DAC lookup changes.
  • Extends loader/runtime-type-system contracts for member refs and async variant resolution, with design docs.
Summary per file:

File | Description
--- | ---
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/Dbi/IDacDbiInterface.cs | Updates DacDbi signatures and mirrors native code-info data.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/Dbi/DacDbiImpl.cs | Implements native-code-info lookup in cDAC.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/EcmaMetadataUtils.cs | Adds mdtMemberRef token type.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/DataType.cs | Adds EnCData descriptor type.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/Module.cs | Reads optional EnC data list from modules.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/EnCData.cs | Adds managed view over native EnC data nodes.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/CoreCLRContracts.cs | Registers the EnC contract implementation.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/RuntimeTypeSystem_1.cs | Adds async variant resolution support.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/Loader_1.cs | Adds member-ref-to-method lookup helper.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/EnC_1.cs | Implements EnC version lookup contract.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Constants.cs | Adds default EnC version global name.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IRuntimeTypeSystem.cs | Exposes async variant contract API.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/ILoader.cs | Exposes member-ref lookup contract API.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IEnC.cs | Adds public EnC contract abstraction.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/ContractRegistry.cs | Adds EnC contract accessor.
src/coreclr/vm/datadescriptor/datadescriptor.inc | Publishes EnC descriptors, global, and contract.
src/coreclr/vm/ceeload.h | Adds native EnC data list and lookup helpers to Module.
src/coreclr/debug/inc/dacdbistructures.h | Widens code-info EnC version field.
src/coreclr/debug/ee/functioninfo.cpp | Records EnC data when JIT info is initialized.
src/coreclr/debug/di/module.cpp | Casts widened EnC version for DI objects.
src/coreclr/debug/di/divalue.cpp | Casts widened EnC version for function lookup.
src/coreclr/debug/daccess/dacdbiimpl.h | Updates EnC lookup helper signature.
src/coreclr/debug/daccess/dacdbiimpl.cpp | Switches native DAC EnC lookup to new module list.
docs/design/datacontracts/RuntimeTypeSystem.md | Documents async variant contract behavior.
docs/design/datacontracts/Loader.md | Documents member-ref lookup helper.
docs/design/datacontracts/EnC.md | Adds EnC contract design documentation.

Copilot's findings

  • Files reviewed: 26/26 changed files
  • Comments generated: 11

Comment thread src/coreclr/debug/daccess/dacdbiimpl.cpp
Comment thread src/coreclr/debug/ee/functioninfo.cpp
Comment thread src/coreclr/debug/daccess/dacdbiimpl.cpp
Comment thread src/coreclr/vm/ceeload.h
Comment thread docs/design/datacontracts/Loader.md Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 20:47
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 20:50
Contributor

Copilot AI left a comment

Copilot's findings

  • Files reviewed: 26/26 changed files
  • Comments generated: 5

Comment thread src/coreclr/debug/inc/dacdbistructures.h
Comment thread docs/design/datacontracts/Loader.md
Co-authored-by: rcj1 <77995559+rcj1@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 21:11
@rcj1 rcj1 review requested due to automatic review settings May 18, 2026 21:11
Copilot finished work on behalf of rcj1 May 18, 2026 21:12
Contributor

Copilot AI left a comment

Copilot's findings

  • Files reviewed: 27/27 changed files
  • Comments generated: 8

Comment thread src/coreclr/debug/daccess/dacdbiimpl.cpp Outdated
Comment thread src/coreclr/debug/inc/dacdbistructures.h
Comment on lines 1228 to +1237
    this->m_encVersion = this->m_methodInfo->GetCurrentEnCVersion();
#ifdef FEATURE_METADATA_UPDATER
    if (this->m_encVersion != CorDB_DEFAULT_ENC_FUNCTION_VERSION)
    {
        Module* pModule = this->m_pLoaderModule;
        EnCData* pEnCData = (EnCData*)(void*)pModule->GetLoaderAllocator()->GetLowFrequencyHeap()->AllocMem(S_SIZE_T(sizeof(EnCData)));
        pEnCData->addrOfCode = (TADDR)this->m_addrOfCode;
        pEnCData->token = this->m_methodInfo->m_token;
        pEnCData->encVersion = this->m_encVersion;
        pModule->AddEncData(pEnCData);
Contributor Author

@rcj1 rcj1 May 18, 2026

Good catch. This is possibly relevant only in HRESULT CordbObjectValue::GetFunction(ICorDebugFunction **ppFunction), where it could return the latest jitted version rather than the latest version overall. This is probably worth fixing.

uint coldSize = 0;
if (cbh is not null)
{
em.GetMethodRegionInfo(cbh.Value, out hotSize, out coldStart, out coldSize);
Comment thread src/native/managed/cdac/tests/DacDbiImplTests.cs
Comment thread docs/design/datacontracts/Loader.md
@rcj1 rcj1 force-pushed the copilot/cdac-enc-getnativecodeinfo branch from 829f44a to 7f0ded6 Compare May 18, 2026 22:24
@github-actions
Contributor

Caution

Security scanning requires review for Code Review

Details

The threat detection results could not be parsed. The workflow output should be reviewed before merging.

Review the workflow run logs for details.

Note

This review was generated by Copilot.

🤖 Copilot Code Review — PR #128338

Holistic Assessment

Motivation: The PR is well-motivated. It restructures debugger-side EnC version lookup to avoid walking all DebuggerMethodInfo/DebuggerJitInfo entries, introduces a new cDAC EnC contract, and implements the GetNativeCodeInfo/GetNativeCodeInfoForAddr DacDbi APIs in managed code. This is a clear improvement for both performance and cDAC coverage.

Approach: The approach is sound — a simple linked list on Module populated during DebuggerJitInfo::Init provides a lightweight data structure readable by both DAC and cDAC. The contract design follows established patterns. The async variant unwrapping and interpreter precode resolution are correctly handled.

Summary: ⚠️ Needs Human Review. The code is well-structured and follows existing patterns. I have a few concerns around thread safety of AddEncData and the "latest version" semantics that warrant human judgment.


Detailed Findings

⚠️ Thread Safety — AddEncData uses non-atomic list prepend

Module::AddEncData does a simple linked-list prepend without any synchronization:

pData->pNext = m_pEnCDataList;
m_pEnCDataList = dac_cast<PTR_EnCData>(pData);

If two threads can JIT EnC-updated methods concurrently on the same module, the unsynchronized read-then-write of m_pEnCDataList is a data race: both threads can link their node to the same old head, and one of the two prepends is lost. In practice, EnC likely holds the debugger lock, so this may be safe, but it is worth confirming. If AddEncData is always called under the debugger lock, a comment to that effect would help future readers. Advisory: not merge-blocking, since EnC operations are typically serialized.
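
If it turned out that no outer lock protects the list, one option would be to publish the new head with a compare-and-swap loop. The following is a minimal sketch under that assumption, not the runtime's code: `EnCNode` and `AddEncDataAtomic` are hypothetical names, and the real `Module` member is a `PTR_EnCData`, not a `std::atomic`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the PR's EnCData node. Field names follow the
// diff; the types are illustrative only.
struct EnCNode {
    std::uintptr_t addrOfCode;
    std::uint32_t  token;
    std::uint64_t  encVersion;
    EnCNode*       pNext;
};

// Lock-free prepend. On failure, compare_exchange_weak reloads `expected`
// with the current head, so the loop relinks the node and retries.
inline void AddEncDataAtomic(std::atomic<EnCNode*>& head, EnCNode* node) {
    EnCNode* expected = head.load(std::memory_order_relaxed);
    do {
        node->pNext = expected;
    } while (!head.compare_exchange_weak(expected, node,
                                         std::memory_order_release,
                                         std::memory_order_relaxed));
}

// Walks the list and counts nodes; used here only to check the result.
inline std::size_t CountNodes(const std::atomic<EnCNode*>& head) {
    std::size_t count = 0;
    for (const EnCNode* cur = head.load(std::memory_order_acquire);
         cur != nullptr; cur = cur->pNext) {
        ++count;
    }
    return count;
}
```

If every caller already holds the debugger lock, the plain prepend is sufficient and a comment stating that invariant is the cheaper fix.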

⚠️ GetLatestEnCVersion returns the first match, not necessarily the latest

FindLatestEncData (native) and FindFirstByToken (cDAC) both return the first entry found by token. Since AddEncData prepends to the list, the first entry is indeed the most recently added one — so this is correct. However, the correctness depends on the prepend-ordering invariant. If the list were ever reordered or if entries were inserted in a different order, this would silently return wrong data. A brief comment in both FindLatestEncData and FindFirstByToken noting "returns most recent because list is prepend-ordered" would improve clarity. Advisory.
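
The dependence on prepend ordering can be illustrated with a small sketch. `ModuleSketch` and `EnCEntry` are hypothetical stand-ins, not the types in ceeload.h; only the prepend and the first-match scan are taken from the discussion above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical node type mirroring the fields named in the PR.
struct EnCEntry {
    std::uintptr_t addrOfCode;
    std::uint32_t  token;
    std::uint64_t  encVersion;
    EnCEntry*      pNext;
};

// Hypothetical stand-in for the Module-side list.
struct ModuleSketch {
    EnCEntry* m_pEnCDataList = nullptr;

    // Prepend: the newest entry always becomes the head.
    void AddEncData(EnCEntry* pData) {
        pData->pNext = m_pEnCDataList;
        m_pEnCDataList = pData;
    }

    // Returns the most recently added entry for `token`. This is only
    // "the latest" because the list is prepend-ordered.
    const EnCEntry* FindFirstByToken(std::uint32_t token) const {
        for (const EnCEntry* cur = m_pEnCDataList; cur != nullptr; cur = cur->pNext) {
            if (cur->token == token) {
                return cur;
            }
        }
        return nullptr;
    }
};
```

Appending instead of prepending would make the same scan return the oldest entry, which is the silent failure mode the finding warns about.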

✅ Interpreter precode resolution in GetNativeCodeInfoForAddr

The review comments show this was caught and fixed — GetInterpreterCodeFromInterpreterPrecodeIfPresent is now called before GetCodeBlockHandle. This correctly mirrors native behavior.

✅ EnC contract is properly optional via TryGetContract

Both GetNativeCodeInfo and GetNativeCodeInfoForAddr use TryGetContract<IEnC> so runtimes without FEATURE_METADATA_UPDATER gracefully fall back to the default version. Good pattern.

NativeCodeFunctionData.encVersion type change to ULONG64

Changing from SIZE_T to ULONG64 makes the wire format consistent across 32/64-bit targets. The casts to (SIZE_T) in the DI layer preserve existing RS behavior. This is correct.

LookupMemberRefAsMethod implementation and contract

Clean implementation delegating to existing GetModuleLookupMapElement with the IS_FIELD_MEMBER_REF flag check. Matches the documented contract.

GetAsyncVariant implementation

Mirrors the native GetAsyncVariantNoCreate logic correctly — iterates introduced methods on canonical MT, checks matching token with IsAsyncVariant && !ReturnDroppingThunk. This is a read-only lookup that doesn't create anything, matching the "NoCreate" semantics.

💡 EnC_1 reads CorDBDefaultEnCFunctionVersion as ulong, but the global is declared as T_NUINT

The global is declared as T_NUINT in the data descriptor, so on a 32-bit target this would be 4 bytes. ReadGlobal<ulong> reads 8 bytes. This may be fine if ReadGlobal for T_NUINT globals auto-widens, but worth verifying the plumbing handles 32-bit targets correctly. Low confidence — likely fine given existing patterns with other nuint globals.
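
The concern can be made concrete with a sketch of a width-aware read. This is not the cDAC's `ReadGlobal` implementation; `ReadTargetNUInt` is a hypothetical helper, and it assumes a little-endian target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Reads a pointer-sized unsigned integer (4 or 8 bytes, little-endian) from
// `buffer` and widens it to 64 bits. Hypothetical helper for illustration.
inline std::uint64_t ReadTargetNUInt(const unsigned char* buffer, std::size_t pointerSize) {
    if (pointerSize != 4 && pointerSize != 8) {
        throw std::invalid_argument("pointerSize must be 4 or 8");
    }
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < pointerSize; ++i) {
        value |= static_cast<std::uint64_t>(buffer[i]) << (8 * i);
    }
    return value;
}
```

With a 4-byte global followed by unrelated bytes, reading 4 bytes yields the stored value, while an unconditional 8-byte read also picks up the neighbouring bytes in the upper half. That is why it matters whether the reader widens based on the target's pointer size.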

✅ Debug validation pattern

The #if DEBUG blocks that validate cDAC output against legacy DAC output are thorough and follow the established pattern in other DacDbi implementations. Good for catching regressions during development.

✅ Test coverage

New tests cover both GetNativeCodeInfo (latest EnC version path) and GetNativeCodeInfoForAddr (address-specific EnC version). The mock setup is clean and follows existing test patterns in the file.

Generated by Code Review for issue #128338 · ● 3.6M ·

Co-authored-by: rcj1 <77995559+rcj1@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 22:33
@rcj1 rcj1 review requested due to automatic review settings May 18, 2026 22:33
Co-authored-by: rcj1 <77995559+rcj1@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 22:34
@rcj1 rcj1 review requested due to automatic review settings May 18, 2026 22:34
Co-authored-by: rcj1 <77995559+rcj1@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 22:35
@rcj1 rcj1 review requested due to automatic review settings May 18, 2026 22:35
Co-authored-by: rcj1 <77995559+rcj1@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 22:35
@rcj1 rcj1 review requested due to automatic review settings May 18, 2026 22:35
Co-authored-by: rcj1 <77995559+rcj1@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 22:36
@rcj1 rcj1 review requested due to automatic review settings May 18, 2026 22:36
Co-authored-by: rcj1 <77995559+rcj1@users.noreply.github.com>
Comment thread src/coreclr/vm/ceeload.h
void AddEncData(EnCData* pData)
{
    LIMITED_METHOD_CONTRACT;
    pData->pNext = m_pEnCDataList;
Member

Does this need synchronization?

Comment thread src/coreclr/vm/ceeload.h
return nullptr;
}

PTR_EnCData FindLatestEncData(mdMethodDef token)
Member

How does this work with generic methods? Generic methods can have multiple instantiations.

Contributor Author

Hm, the thing is this doesn’t look like it ever worked to distinguish generic instantiations when getting the latest EnC version. I wonder if we even need/use this?

Member

Currently I see FindLatestEncData getting used in CordbObjectValue::GetFunctionHelper but it doesn't have to be. Instead of calling CordbModule::LookupOrCreateFunction(token,encVersion) where encVersion ultimately gets calculated in this call you can instead call CordbModule::LookupOrCreateFunctionLatestVersion(token). That LatestVersion variant of the API already has another independent path to discover the latest (potentially unjitted) ENC version based on Debugger::UpdateFunction sending the new ENC version in updates.

Aside from CordbObjectValue::GetFunctionHelper I only see one other callsite for GetNativeCodeInfo and that one doesn't look at the ENC version information that is returned. You could modify GetNativeCodeInfo to only return exactly what that caller needs (MethodDesc+CodeStartAddr) which will eliminate the only callsite that cared about the pLatestEnCVersion param on LookupEnCVersions(). Then you can remove that parameter and this method.

Contributor Author

That sounds good. @jkotas it looks like generics in EnC are handled here, in that edits are made per-method and later applied per-instantiation:

// Generics are involved so we need to search for all related MethodDescs.

Comment thread src/coreclr/vm/ceeload.h
    SUPPORTS_DAC;
    for (PTR_EnCData pCur = m_pEnCDataList; pCur != nullptr; pCur = pCur->pNext)
    {
        if (pCur->token == token && pCur->addrOfCode == addrOfCode)
Member

token match should be unnecessary. The code should be enough to identify the method.

Comment thread src/coreclr/vm/ceeload.h
return nullptr;
}

PTR_EnCData FindLatestEncData(mdMethodDef token)
Member

This will find the most recently JITed method body. It may not be the one with the highest EnC version. My guess is that you expect this is to find the highest EnC version. Is that right?
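
The gap between "most recently JITed" and "highest EnC version" can be shown with a short sketch. The names here are hypothetical, and the JIT order used in the test (version 3 compiled before version 2) is an assumed scenario chosen only to expose the difference.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical record of one JITed method body.
struct JitRecord {
    std::uint32_t token;
    std::uint64_t encVersion;
    JitRecord*    pNext;
};

// Records are prepended in the order the bodies were JITed.
inline void Prepend(JitRecord*& head, JitRecord* record) {
    record->pNext = head;
    head = record;
}

// First match in a prepend-ordered list: the most recently JITed body.
inline const JitRecord* FindMostRecentlyJitted(const JitRecord* head, std::uint32_t token) {
    for (const JitRecord* cur = head; cur != nullptr; cur = cur->pNext) {
        if (cur->token == token) return cur;
    }
    return nullptr;
}

// Full scan: the JITed body with the highest EnC version for `token`.
inline const JitRecord* FindHighestVersion(const JitRecord* head, std::uint32_t token) {
    const JitRecord* best = nullptr;
    for (const JitRecord* cur = head; cur != nullptr; cur = cur->pNext) {
        if (cur->token == token && (best == nullptr || cur->encVersion > best->encVersion)) {
            best = cur;
        }
    }
    return best;
}
```

Even the full scan only sees versions that have been JITed. A newer version that was applied but never compiled has no record at all, which is the case the thread describes as the latest, potentially unjitted, version.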

Contributor Author

Yes, Copilot pointed out this discrepancy here: #128338 (comment). I think I’ll have to move the mutation of the linked list up next to SetCurrentEnCVersion.

Member

We can refactor to remove the dependency on this method entirely (https://github.com/dotnet/runtime/pull/128338/changes#r3265486372)

Comment thread src/coreclr/vm/ceeload.h
{
    LIMITED_METHOD_CONTRACT;
    SUPPORTS_DAC;
    for (PTR_EnCData pCur = m_pEnCDataList; pCur != nullptr; pCur = pCur->pNext)
Member

I'm worried about the performance overhead of an O(N) linked-list search. Although I suspect many uses of ENC are small, I'm not confident that all of them are, and this method can get called a lot.

Rather than a new linked list, how about we add the ENC version to the method DebugInfo? We could add an ENCVersion as the 7th item in the FAT header. This would have no cost for most methods which use the slim header or 1 nibble per method for those using the FAT header.

Contributor Author

If we don't have to keep track of unjitted methods, we could stuff the EnC version into the code header.

Member

I do not think it belongs to code header directly.

I think having in the side-table where it is today or in the debug info as suggested by Noah are reasonable options.

We do have yet another code version scheme for tiered compilation and Rejit. Rejit is very much like EnC. Would it make sense to store the EnC version in the existing structure used by tiered compilation and Rejit? (ILCodeVersionNode, NativeCodeVersionNode, etc.)

Member

> Would it make sense to store the EnC version in the existing structure used by tiered compilation and Rejit?

Architecturally yes, it makes sense to have it there. I didn't initially propose that because I worried it might be a more elaborate change, but it's where I envisioned ENC would be some day. If it winds up being not too hard to add it there right now, that's very nice. I am guessing some of the work along that route would be:

  • ENC currently uses independent logic to determine the active version and switch between versions
  • ICorDebug is aware of ILCodeVersions but currently they have different treatment than ENC versions. Right now ILCodeVersions generate multiple ICorDebugILCode objects parented under the same ICorDebugFunction whereas ENC expects multiple unique ICorDebugFunctions. Assuming new ENCVersions get represented at runtime as new ILCodeVersionNodes we'd need to teach ICorDebug to distinguish them.

Contributor Author

If we are just concerned about performance, a cheap alternative would be to add a lookup map keyed by method token on the module, and have it store a pointer to per-token linked lists of (codeAddr, version) - the same structure used in ILCodeVersions. I am not sure how much work it would be to completely unify EnC and ReJIT, or if maybe there is a middle ground where we add a node but say "actually this is an EnC version".
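
A rough sketch of that cheaper alternative follows. Everything in it is hypothetical: `EnCVersionMap`, `CodeVersionEntry`, and the placeholder `kDefaultEnCVersion` are illustration names, the method names only mirror the contract signatures from the PR description, and standard containers stand in for whatever lookup map the runtime would actually use.

```cpp
#include <cassert>
#include <cstdint>
#include <forward_list>
#include <unordered_map>

// Placeholder default version for this sketch only.
constexpr std::uint64_t kDefaultEnCVersion = 1;

// One (code address, EnC version) pair for a JITed body.
struct CodeVersionEntry {
    std::uintptr_t codeAddr;
    std::uint64_t  encVersion;
};

// Per-module map from method token to a short per-token list, so a lookup
// only scans the versions of one method instead of the whole module.
class EnCVersionMap {
public:
    void Add(std::uint32_t methodDef, std::uintptr_t codeAddr, std::uint64_t encVersion) {
        m_byToken[methodDef].push_front(CodeVersionEntry{codeAddr, encVersion});
    }

    // Highest recorded version for the token, or the default if none.
    std::uint64_t GetLatestEnCVersion(std::uint32_t methodDef) const {
        std::uint64_t latest = kDefaultEnCVersion;
        auto it = m_byToken.find(methodDef);
        if (it != m_byToken.end()) {
            for (const CodeVersionEntry& entry : it->second) {
                if (entry.encVersion > latest) latest = entry.encVersion;
            }
        }
        return latest;
    }

    // Version of the body at `codeAddr`, or the default if not recorded.
    std::uint64_t GetEnCVersion(std::uint32_t methodDef, std::uintptr_t codeAddr) const {
        auto it = m_byToken.find(methodDef);
        if (it != m_byToken.end()) {
            for (const CodeVersionEntry& entry : it->second) {
                if (entry.codeAddr == codeAddr) return entry.encVersion;
            }
        }
        return kDefaultEnCVersion;
    }

private:
    std::unordered_map<std::uint32_t, std::forward_list<CodeVersionEntry>> m_byToken;
};
```

The hash lookup narrows the search to one method's versions, so the remaining linear scan is bounded by how many times that single method was edited and JITed rather than by the module's total edit count.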

Member

> if maybe there is a middle ground where we add a node but say "actually this is an EnC version".

I think it would be a significant improvement compared to what we have today.

Contributor Author

@rcj1 rcj1 May 19, 2026

> > if maybe there is a middle ground where we add a node but say "actually this is an EnC version".
>
> I think it would be a significant improvement compared to what we have today.

I can see us introducing ILCodeVersions for each EnC edit, each of which would get a singular NativeCodeVersion. We would not fold the actual logic of EnC into the logic for CodeVersions, which means that the existing iterators over CodeVersions would have to be updated to skip EnC nodes. We could even fold the APIs I just added on the EnC contract into the CodeVersions contract; it would be iterating the same data with a different filter.

Do you think this makes sense or would we have to go further in integrating the two for it to make sense?

Contributor Author

@rcj1 rcj1 May 19, 2026

I don't believe finding EnC methods, fields, etc. could be easily folded into CodeVersions, so we would still need an EnC contract until this goes away.

Comment thread src/coreclr/debug/daccess/dacdbiimpl.cpp
5 participants