[cDAC] Source generator for IData<T> data classes #128356

Open
max-charlamb wants to merge 18 commits into dotnet:main from max-charlamb:dev/max-charlamb/cdac-source-generator-prototype

Member

@max-charlamb max-charlamb commented May 19, 2026

Introduces a Roslyn incremental source generator (DataGenerator) for cDAC IData<T> data classes, and ports ~150 hand-written classes under Microsoft.Diagnostics.DataContractReader.Contracts/Data/ to the new attribute-driven form.

Attribute surface

  • [CdacType(params string[] names)] — candidate type names; HasTypeHandle = true emits a TypeHandle(Target) accessor.
  • [Field] — instance field read from the descriptor. Writable = true emits Write{Name}. Pointer = true dereferences via GetOrAdd<T>. UsePropertyName = false opts out of property-name fallback.
  • [StaticAddress] / [StaticReference] — static field accessors. Try native globals first (TypeName.fieldName), then fall back to ManagedTypeSource.
  • [ThreadStaticAddress] — thread-static field accessors via ManagedTypeSource.
  • [RawOffset] / [FieldAddress] / [InstanceDataStart] — low-level offset and address attributes.
  • OnInit(Target, TargetPointer) — escape hatch for logic that doesn't fit the declarative surface.

Design notes

See docs/design/datacontracts/IData.md for the full attribute surface and good practices:

  • Avoid algorithm logic in IData classes — declarative attributes only; use OnInit for things that don't fit.
  • Avoid eagerly dereferencing pointers to other IData classes — store as TargetPointer and let callers materialize, to avoid ambiguous null semantics. Inline structs are fine.
  • One class, one descriptor.

Intentional surface refinements

A handful of conversions go slightly beyond a mechanical port (also documented in IData.md):

  • Thread.RuntimeThreadLocals — eager IData deref -> lazy TargetPointer; Thread_1.cs materializes via ProcessedData.GetOrAdd.
  • InteropSyncBlockInfo.{RCW,CCW,CCF,TaggedMemory} — always-non-null TargetPointer (with .Null sentinel) -> nullable TargetPointer?; SyncBlock_1.cs updated.
  • Thread.DebuggerControlledThreadState — now a real [Field(Writable = true)]; Set/Reset paths use the generated WriteDebuggerControlledThreadState against the cached Data.Thread.
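The Set/Reset change in the last bullet can be pictured with a minimal, self-contained sketch. It does not use the real cDAC types: the two dictionaries stand in for target memory and the `ProcessedData` cache, `GetOrAdd`, `WriteState`, `SetState`, and `ResetState` are hypothetical local functions, and the assumption that a generated `Write{Name}` refreshes the cached snapshot after writing to the target is mine, not something this description states.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for target process memory: address -> 32-bit field value.
const ulong threadAddress = 0x1000;
var targetMemory = new Dictionary<ulong, uint> { [threadAddress] = 0b0001 };

// Hypothetical stand-in for the ProcessedData cache: address -> cached snapshot.
var cache = new Dictionary<ulong, uint>();

// Returns the cached snapshot, reading the target only on first use.
uint GetOrAdd(ulong address)
{
    if (!cache.TryGetValue(address, out uint snapshot))
    {
        snapshot = targetMemory[address];
        cache[address] = snapshot;
    }
    return snapshot;
}

// Assumed shape of a write-back: write through to the target, then refresh the snapshot
// so later reads of the cached value do not see stale data.
void WriteState(ulong address, uint value)
{
    targetMemory[address] = value;
    cache[address] = value;
}

// Set ORs the flag into the cached value; Reset clears it with AND NOT.
void SetState(ulong address, uint flag) => WriteState(address, GetOrAdd(address) | flag);
void ResetState(ulong address, uint flag) => WriteState(address, GetOrAdd(address) & ~flag);

SetState(threadAddress, 0b0100);
Console.WriteLine($"0x{targetMemory[threadAddress]:X}"); // prints "0x5"
ResetState(threadAddress, 0b0001);
Console.WriteLine($"0x{targetMemory[threadAddress]:X}"); // prints "0x4"
```

If the write-back did not refresh the snapshot, the second call would compute its new value from the stale cached `0b0001` and silently drop the flag set by the first call.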

Migration Support

Managed-to-native

The generator is designed so that managed types can be migrated to native data descriptors transparently. The IData class does not need to change -- the [CdacType] name list and field-name cascade handle the fallback automatically. Static fields are supported as native globals using the TypeName.fieldName naming scheme.

Example: Lock is currently a managed-only type:

[CdacType("System.Threading.Lock")]
internal sealed partial class Lock : IData<Lock>
{
    [Field("_state")]
    public uint State { get; }

    [Field("_owningThreadId")]
    public int OwningThreadId { get; }

    [Field("_recursionCount")]
    public uint RecursionCount { get; }
}

To migrate this to a native data descriptor, add an entry in datadescriptor.inc with the matching fully qualified name and field names. The data descriptor macro infrastructure currently uses C preprocessor token pasting for type names, which does not support dotted names like System.Threading.Lock. A future PR will add a two-argument macro form (e.g., CDAC_TYPE_BEGIN_2(tag, jsonname)) that separates the C++ identifier from the JSON name, allowing dotted managed names to flow through. The IData class itself requires no changes.
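As a rough illustration of the name fallback described above, the following sketch resolves a field offset by trying native descriptors before managed type metadata. It is not the generator's emitted code: the dictionaries, the `ResolveFieldOffset` helper, the offsets, and the exact lookup order (native first, then managed, with the explicit field name tried before the property name) are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the two sources: type name -> (field name -> offset).
var nativeDescriptors = new Dictionary<string, Dictionary<string, int>>();
var managedTypes = new Dictionary<string, Dictionary<string, int>>
{
    ["System.Threading.Lock"] = new() { ["_state"] = 8, ["_owningThreadId"] = 12 },
};

// Tries every candidate type name and field name against the native descriptors,
// then against managed metadata. Returns null when no source knows the field.
int? ResolveFieldOffset(string[] typeNames, string[] fieldNames)
{
    foreach (var source in new[] { nativeDescriptors, managedTypes })
    {
        foreach (string typeName in typeNames)
        {
            if (!source.TryGetValue(typeName, out var fields))
                continue;
            foreach (string fieldName in fieldNames)
            {
                if (fields.TryGetValue(fieldName, out int offset))
                    return offset;
            }
        }
    }
    return null;
}

string[] lockNames = { "System.Threading.Lock" };
string[] stateNames = { "_state", "State" }; // explicit name first, property name second

// Before migration only the managed metadata knows the type.
Console.WriteLine(ResolveFieldOffset(lockNames, stateNames)); // prints 8

// After migration a native descriptor with the same name takes precedence.
nativeDescriptors["System.Threading.Lock"] = new() { ["_state"] = 0 };
Console.WriteLine(ResolveFieldOffset(lockNames, stateNames)); // prints 0
```

The caller-side names never change between the two lookups, which is the property the migration story relies on.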

Native-to-managed migration (future)

Native-to-managed migration is not yet supported. A future PR will add type forwarding logic to the data descriptor system to enable this direction. The IData.md doc has a placeholder section for this.

Test results

All cDAC unit tests pass (2177 + 32 source-generator-specific tests). 16 skipped (unchanged from main). Source generator tests cover:

  • Field reads (native, managed, cross-source fallback)
  • Writable fields and write-back
  • Static and thread-static field accessors
  • TypeHandle generation
  • Name fallback across multiple candidate names
  • Managed-to-native migration scenarios

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a Roslyn incremental source generator (DataGenerator) that emits IData<T>.Create factories, field accessors, managed-type lookups, and write-back methods from declarative attributes ([CdacType], [Field], [FieldAddress], [InstanceDataStart], [FieldOffset], [Static…], [ThreadStaticAddress]). It then mechanically converts ~150 hand-written IData<T> classes under Microsoft.Diagnostics.DataContractReader.Contracts/Data/ to the new attribute-driven form, removing ~1300 lines of boilerplate. The PR also depends on/includes the new IManagedTypeSource contract from #127310 and refactors callers (SyncBlock_1, Thread_1, Debugger_1, AuxiliarySymbols_1, dump tests) to use it.

Changes:

  • New Microsoft.Diagnostics.DataContractReader.DataGenerator Roslyn analyzer project (Model, EquatableArray, generator entry, plus attribute/parser/emitter sources not all shown).
  • Conversion of ~150 Data/* classes to [CdacType] + partial with attribute-driven member declarations and optional partial void OnInit hooks.
  • Surface refinements: Thread.RuntimeThreadLocals becomes a lazy TargetPointer; InteropSyncBlockInfo.{RCW,CCW,CCF,TaggedMemory} become TargetPointer?; Debugger writable fields use [Field(Writable = true)] with generated Write{Name} methods; dump/SyncBlock_1/AuxiliarySymbols_1 updated to follow.

Reviewed changes

Copilot reviewed 183 out of 184 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| DataGenerator/*.cs, .csproj | New incremental generator project (model types, EquatableArray helper, IsExternalInit shim, generator entry point). |
| Contracts.csproj, cdac.slnx | Wire generator as analyzer; add generator project to solution. |
| Data/*.cs (~140 files) | Mechanical port to [CdacType] + partial class with [Field]/[FieldAddress]/[InstanceDataStart]/[FieldOffset] properties; some classes retain OnInit for non-declarative logic. |
| Data/Managed/*.cs | New per-managed-type wrappers (Lock, List, ComWrappers, NativeObjectWrapper, ConditionalWeakTable*) using ManagedFullName. |
| Data/AuxiliarySymbolInfo.cs | Address renamed to CodeAddress to avoid colliding with generator-emitted Address. |
| Data/Debugger.cs | SetField helper removed; writable fields use [Field(Writable = true)] and generated Write{Name}. |
| Data/InteropSyncBlockInfo.cs, Data/SyncBlock.cs | RCW/CCW/CCF/TaggedMemory become nullable; SyncBlock loses Address. |
| Contracts/Thread_1.cs | Materializes RuntimeThreadLocals lazily; handles new nullable ExceptionWatsonBucketTrackerBuckets / UEWatsonBucketTrackerBuckets. |
| Contracts/SyncBlock_1.cs | Uses Data.Managed.Lock + IManagedTypeSource instead of hand-rolled metadata walk; updated for nullable interop pointers. |
| Contracts/Debugger_1.cs, Contracts/AuxiliarySymbols_1.cs, CoreCLRContracts.cs | Switch to Write{Name} helpers; register ManagedTypeSource contract; rename to CodeAddress. |
| Abstractions/Contracts/IManagedTypeSource.cs, ContractRegistry.cs, IRuntimeTypeSystem.cs | New IManagedTypeSource contract; remove GetTypeByNameAndModule / GetCoreLibFieldDescAndDef from IRuntimeTypeSystem. |
| datadescriptor.inc | Register ManagedTypeSource contract version. |
| docs/design/datacontracts/ComWrappers.md | Doc updated to describe ManagedTypeSource-based lookups. |
| tests/DumpTests/*.cs | Replace rts.GetTypeByNameAndModule calls with IManagedTypeSource.GetTypeHandle/TryGetThreadStaticFieldAddress. |

@max-charlamb max-charlamb force-pushed the dev/max-charlamb/cdac-source-generator-prototype branch from bc0e567 to 04d4c54 Compare May 19, 2026 14:41
Copilot AI review requested due to automatic review settings May 19, 2026 14:41
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 184 out of 184 changed files in this pull request and generated no new comments.

@max-charlamb max-charlamb force-pushed the dev/max-charlamb/cdac-source-generator-prototype branch from 1f09704 to 2440300 Compare May 19, 2026 20:28
Copilot AI review requested due to automatic review settings May 19, 2026 20:28
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 194 out of 194 changed files in this pull request and generated no new comments.

@max-charlamb max-charlamb force-pushed the dev/max-charlamb/cdac-source-generator-prototype branch from f913e03 to 2440300 Compare May 19, 2026 20:35
Copilot AI review requested due to automatic review settings May 19, 2026 20:35
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 194 out of 194 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings May 19, 2026 21:43
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 194 out of 194 changed files in this pull request and generated no new comments.

TargetPointer dataAddr = lockObj.Data;
uint state = ReadUintField(lockType, LockStateName, rts, mdReader, dataAddr);
bool monitorHeld = (state & 1) != 0;
Data.Managed.Lock lockData = _target.ProcessedData.GetOrAdd<Data.Managed.Lock>(sb.Lock.Object);
Member

I see the PR still has distinct Data.Managed.* and Data.* types which implies we can't change a type's implementation between managed and native without also making a parallel (breaking) change to cDAC. Am I interpreting that right?

I think we want to have flexibility to shift between managed and native implementations without requiring a cDAC change. Do you think that is something we can amend this PR to do, or are there substantial challenges to being able to do that? I focus on this part first because I think the design decisions here will have a number of cascading effects.

As a concrete example I'm imagining two different versions of the runtime where one implements Lock as:

class Lock // C++
{
    private:
    int _state;
} 

Lock* g_pLockInstance;

And the other is:

namespace System.Runtime; // C#
class Lock
{
     private int _state;
     private static Lock s_lockInstance;
}

The goal would be that we are free to require anything in the runtime data descriptor we want to aid in the migration, but we could write the cDAC code targeting one of those shapes initially and switch to the other one without any of the debugging tools needing an update.

Member Author

Thanks for pointing this out. This is an oversight. The two different versions don't need to be separated.

Here is a sample that makes lock work with both native and managed: max-charlamb@c9ce9363b38

Copilot AI review requested due to automatic review settings May 20, 2026 16:34
@max-charlamb max-charlamb marked this pull request as ready for review May 20, 2026 16:34
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 194 out of 194 changed files in this pull request and generated 4 comments.

/// <c>TargetLayoutExtensions.ResolveLayouts</c> for IData classes that opt into
/// per-field fallback between native cdac descriptors and managed type metadata.
/// </summary>
public abstract bool TryGetTypeInfo(string typeName, out TypeInfo info);
Comment on lines +9 to +27
/// <summary>
/// Resolves layout information for managed CLR types by fully-qualified name.
/// </summary>
public interface IManagedTypeSource : IContract
{
static string IContract.Name { get; } = nameof(ManagedTypeSource);

bool TryGetTypeInfo(string fullyQualifiedName, out Target.TypeInfo info) => throw new NotImplementedException();
Target.TypeInfo GetTypeInfo(string fullyQualifiedName) => throw new NotImplementedException();

bool TryGetTypeHandle(string fullyQualifiedName, out TypeHandle typeHandle) => throw new NotImplementedException();
TypeHandle GetTypeHandle(string fullyQualifiedName) => throw new NotImplementedException();

bool TryGetStaticFieldAddress(string fullyQualifiedName, string fieldName, out TargetPointer address) => throw new NotImplementedException();
TargetPointer GetStaticFieldAddress(string fullyQualifiedName, string fieldName) => throw new NotImplementedException();

bool TryGetThreadStaticFieldAddress(string fullyQualifiedName, string fieldName, TargetPointer thread, out TargetPointer address) => throw new NotImplementedException();
TargetPointer GetThreadStaticFieldAddress(string fullyQualifiedName, string fieldName, TargetPointer thread) => throw new NotImplementedException();
}
Comment on lines +129 to +135
if (managedFullName is not null
&& target.Contracts.ManagedTypeSource.TryGetTypeInfo(managedFullName, out Target.TypeInfo m))
{
managed = m;
if (!isValueType)
managedDataOffset = target.GetTypeInfo("Object").Size!.Value;
}
Comment on lines +43 to +61
List<MemberModel> members = new();
foreach (ISymbol member in classSymbol.GetMembers())
{
switch (member)
{
case IPropertySymbol prop:
if (TryParseProperty(prop, out MemberModel? pm))
{
members.Add(pm!);
}
break;
case IMethodSymbol method:
if (TryParseStaticMethod(method, out MemberModel? mm))
{
members.Add(mm!);
}
break;
}
}
Copilot AI review requested due to automatic review settings May 20, 2026 20:18
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 194 out of 194 changed files in this pull request and generated 6 comments.

Comment on lines 49 to 53
void IThread.SetDebuggerControlledThreadState(TargetPointer thread, DebuggerControlledThreadState state)
{
uint current = _target.ReadField<uint>(thread, _threadTypeInfo, nameof(Data.Thread.DebuggerControlledThreadState));
_target.WriteField(thread, _threadTypeInfo, nameof(Data.Thread.DebuggerControlledThreadState), current | (uint)state);
Data.Thread t = _target.ProcessedData.GetOrAdd<Data.Thread>(thread);
t.WriteDebuggerControlledThreadState(_target, t.DebuggerControlledThreadState | (uint)state);
}
Comment on lines 55 to 59
void IThread.ResetDebuggerControlledThreadState(TargetPointer thread, DebuggerControlledThreadState state)
{
uint current = _target.ReadField<uint>(thread, _threadTypeInfo, nameof(Data.Thread.DebuggerControlledThreadState));
_target.WriteField(thread, _threadTypeInfo, nameof(Data.Thread.DebuggerControlledThreadState), current & ~(uint)state);
Data.Thread t = _target.ProcessedData.GetOrAdd<Data.Thread>(thread);
t.WriteDebuggerControlledThreadState(_target, t.DebuggerControlledThreadState & ~(uint)state);
}
/// <c>TargetLayoutExtensions.ResolveLayouts</c> for IData classes that opt into
/// per-field fallback between native cdac descriptors and managed type metadata.
/// </summary>
public abstract bool TryGetTypeInfo(string typeName, out TypeInfo info);
Comment on lines 185 to 191
// return true if the TypeHandle represents an array, and set the rank to either 0 (if the type is not an array), or the rank number if it is.
bool IsArray(TypeHandle typeHandle, out uint rank) => throw new NotImplementedException();
TypeHandle GetTypeParam(TypeHandle typeHandle) => throw new NotImplementedException();
TypeHandle GetConstructedType(TypeHandle typeHandle, CorElementType corElementType, int rank, ImmutableArray<TypeHandle> typeArguments) => throw new NotImplementedException();
TypeHandle GetPrimitiveType(CorElementType typeCode) => throw new NotImplementedException();
TypeHandle GetTypeByNameAndModule(string name, string nameSpace, ModuleHandle moduleHandle) => throw new NotImplementedException();
bool IsGenericVariable(TypeHandle typeHandle, out TargetPointer module, out uint token) => throw new NotImplementedException();
bool IsFunctionPointer(TypeHandle typeHandle, out ReadOnlySpan<TypeHandle> retAndArgTypes, out byte callConv) => throw new NotImplementedException();
Comment on lines +9 to +26
/// <summary>
/// Resolves layout information for managed CLR types by fully-qualified name.
/// </summary>
public interface IManagedTypeSource : IContract
{
static string IContract.Name { get; } = nameof(ManagedTypeSource);

bool TryGetTypeInfo(string fullyQualifiedName, out Target.TypeInfo info) => throw new NotImplementedException();
Target.TypeInfo GetTypeInfo(string fullyQualifiedName) => throw new NotImplementedException();

bool TryGetTypeHandle(string fullyQualifiedName, out TypeHandle typeHandle) => throw new NotImplementedException();
TypeHandle GetTypeHandle(string fullyQualifiedName) => throw new NotImplementedException();

bool TryGetStaticFieldAddress(string fullyQualifiedName, string fieldName, out TargetPointer address) => throw new NotImplementedException();
TargetPointer GetStaticFieldAddress(string fullyQualifiedName, string fieldName) => throw new NotImplementedException();

bool TryGetThreadStaticFieldAddress(string fullyQualifiedName, string fieldName, TargetPointer thread, out TargetPointer address) => throw new NotImplementedException();
TargetPointer GetThreadStaticFieldAddress(string fullyQualifiedName, string fieldName, TargetPointer thread) => throw new NotImplementedException();
Comment on lines +43 to +61
List<MemberModel> members = new();
foreach (ISymbol member in classSymbol.GetMembers())
{
switch (member)
{
case IPropertySymbol prop:
if (TryParseProperty(prop, out MemberModel? pm))
{
members.Add(pm!);
}
break;
case IMethodSymbol method:
if (TryParseStaticMethod(method, out MemberModel? mm))
{
members.Add(mm!);
}
break;
}
}
Copilot AI review requested due to automatic review settings May 21, 2026 02:38
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 198 out of 198 changed files in this pull request and generated 5 comments.

/// <c>TargetLayoutExtensions.ResolveLayouts</c> for IData classes that opt into
/// per-field fallback between native cdac descriptors and managed type metadata.
/// </summary>
public abstract bool TryGetTypeInfo(string typeName, out TypeInfo info);
Comment on lines +9 to +27
/// <summary>
/// Resolves layout information for managed CLR types by fully-qualified name.
/// </summary>
public interface IManagedTypeSource : IContract
{
static string IContract.Name { get; } = nameof(ManagedTypeSource);

bool TryGetTypeInfo(string fullyQualifiedName, out Target.TypeInfo info) => throw new NotImplementedException();
Target.TypeInfo GetTypeInfo(string fullyQualifiedName) => throw new NotImplementedException();

bool TryGetTypeHandle(string fullyQualifiedName, out TypeHandle typeHandle) => throw new NotImplementedException();
TypeHandle GetTypeHandle(string fullyQualifiedName) => throw new NotImplementedException();

bool TryGetStaticFieldAddress(string fullyQualifiedName, string fieldName, out TargetPointer address) => throw new NotImplementedException();
TargetPointer GetStaticFieldAddress(string fullyQualifiedName, string fieldName) => throw new NotImplementedException();

bool TryGetThreadStaticFieldAddress(string fullyQualifiedName, string fieldName, TargetPointer thread, out TargetPointer address) => throw new NotImplementedException();
TargetPointer GetThreadStaticFieldAddress(string fullyQualifiedName, string fieldName, TargetPointer thread) => throw new NotImplementedException();
}
Comment on lines 186 to 273
bool IsArray(TypeHandle typeHandle, out uint rank) => throw new NotImplementedException();
TypeHandle GetTypeParam(TypeHandle typeHandle) => throw new NotImplementedException();
TypeHandle GetConstructedType(TypeHandle typeHandle, CorElementType corElementType, int rank, ImmutableArray<TypeHandle> typeArguments) => throw new NotImplementedException();
TypeHandle GetPrimitiveType(CorElementType typeCode) => throw new NotImplementedException();
TypeHandle GetTypeByNameAndModule(string name, string nameSpace, ModuleHandle moduleHandle) => throw new NotImplementedException();
bool IsGenericVariable(TypeHandle typeHandle, out TargetPointer module, out uint token) => throw new NotImplementedException();
bool IsFunctionPointer(TypeHandle typeHandle, out ReadOnlySpan<TypeHandle> retAndArgTypes, out byte callConv) => throw new NotImplementedException();
bool IsPointer(TypeHandle typeHandle) => throw new NotImplementedException();
// Returns null if the TypeHandle is not a class/struct/generic variable
#endregion TypeHandle inspection APIs

#region MethodDesc inspection APIs
MethodDescHandle GetMethodDescHandle(TargetPointer targetPointer) => throw new NotImplementedException();
TargetPointer GetMethodTable(MethodDescHandle methodDesc) => throw new NotImplementedException();

// Return true for an uninstantiated generic method
bool IsGenericMethodDefinition(MethodDescHandle methodDesc) => throw new NotImplementedException();
ReadOnlySpan<TypeHandle> GetGenericMethodInstantiation(MethodDescHandle methodDesc) => throw new NotImplementedException();

GenericContextLoc GetGenericContextLoc(MethodDescHandle methodDescHandle) => throw new NotImplementedException();

// Return true if the method uses the async calling convention (CORINFO_CALLCONV_ASYNCCALL).
// This corresponds to native MethodDesc::IsAsyncMethod().
bool IsAsyncMethod(MethodDescHandle methodDesc) => throw new NotImplementedException();

// Return mdtMethodDef (0x06000000) if the method doesn't have a token, otherwise return the token of the method
uint GetMethodToken(MethodDescHandle methodDesc) => throw new NotImplementedException();

// Return true if a MethodDesc represents an array method
// An array method is also a StoredSigMethodDesc
bool IsArrayMethod(MethodDescHandle methodDesc, out ArrayFunctionType functionType) => throw new NotImplementedException();

// Return true if a MethodDesc represents a method without metadata, either an IL Stub dynamically
// generated by the runtime, or a MethodDesc that describes a method represented by the System.Reflection.Emit.DynamicMethod class
// Or something else similar.
// A no metadata method is also a StoredSigMethodDesc
bool IsNoMetadataMethod(MethodDescHandle methodDesc, out string methodName) => throw new NotImplementedException();
// A StoredSigMethodDesc is a MethodDesc for which the signature isn't found in metadata.
bool IsStoredSigMethodDesc(MethodDescHandle methodDesc, out ReadOnlySpan<byte> signature) => throw new NotImplementedException();

// Return true for a MethodDesc that describes a method represented by the System.Reflection.Emit.DynamicMethod class
// A DynamicMethod is also a StoredSigMethodDesc, and a NoMetadataMethod
bool IsDynamicMethod(MethodDescHandle methodDesc) => throw new NotImplementedException();

// Returns true if a MethodDesc represents an IL-backed method
bool IsIL(MethodDescHandle methodDesc) => throw new NotImplementedException();

// Return true if a MethodDesc represents an IL Stub dynamically generated by the runtime
// A IL Stub method is also a StoredSigMethodDesc, and a NoMetadataMethod
bool IsILStub(MethodDescHandle methodDesc) => throw new NotImplementedException();

// Return true if a MethodDesc represents an IL stub with a special MethodDesc context arg
bool HasMDContextArg(MethodDescHandle methodDesc) => throw new NotImplementedException();

bool IsCollectibleMethod(MethodDescHandle methodDesc) => throw new NotImplementedException();
bool IsVersionable(MethodDescHandle methodDesc) => throw new NotImplementedException();

TargetPointer GetMethodDescVersioningState(MethodDescHandle methodDesc) => throw new NotImplementedException();

TargetCodePointer GetNativeCode(MethodDescHandle methodDesc) => throw new NotImplementedException();
TargetCodePointer GetMethodEntryPointIfExists(MethodDescHandle methodDesc) => throw new NotImplementedException();

ushort GetSlotNumber(MethodDescHandle methodDesc) => throw new NotImplementedException();

bool HasNativeCodeSlot(MethodDescHandle methodDesc) => throw new NotImplementedException();

TargetPointer GetAddressOfNativeCodeSlot(MethodDescHandle methodDesc) => throw new NotImplementedException();

TargetPointer GetGCStressCodeCopy(MethodDescHandle methodDesc) => throw new NotImplementedException();

OptimizationTier GetMethodDescOptimizationTier(MethodDescHandle methodDescHandle) => throw new NotImplementedException();
bool IsEligibleForTieredCompilation(MethodDescHandle methodDescHandle) => throw new NotImplementedException();

bool IsAsyncThunkMethod(MethodDescHandle methodDesc) => throw new NotImplementedException();

bool IsWrapperStub(MethodDescHandle methodDesc) => throw new NotImplementedException();
#endregion MethodDesc inspection APIs
#region FieldDesc inspection APIs
TargetPointer GetMTOfEnclosingClass(TargetPointer fieldDescPointer) => throw new NotImplementedException();
uint GetFieldDescMemberDef(TargetPointer fieldDescPointer) => throw new NotImplementedException();
bool IsFieldDescThreadStatic(TargetPointer fieldDescPointer) => throw new NotImplementedException();
bool IsFieldDescStatic(TargetPointer fieldDescPointer) => throw new NotImplementedException();
CorElementType GetFieldDescType(TargetPointer fieldDescPointer) => throw new NotImplementedException();
uint GetFieldDescOffset(TargetPointer fieldDescPointer, FieldDefinition fieldDef) => throw new NotImplementedException();
TargetPointer GetFieldDescByName(TypeHandle typeHandle, string fieldName) => throw new NotImplementedException();
TargetPointer GetFieldDescStaticAddress(TargetPointer fieldDescPointer, bool unboxValueTypes = true) => throw new NotImplementedException();
TargetPointer GetFieldDescThreadStaticAddress(TargetPointer fieldDescPointer, TargetPointer thread, bool unboxValueTypes = true) => throw new NotImplementedException();
#endregion FieldDesc inspection APIs
#region Other APIs
void GetCoreLibFieldDescAndDef(string typeNamespace, string typeName, string fieldName, out TargetPointer fieldDescAddr, out FieldDefinition fieldDef) => throw new NotImplementedException();
#endregion Other APIs
}
Comment on lines +29 to +40
AttributeData? cdacAttr = classSymbol.GetAttributes()
.FirstOrDefault(a => a.AttributeClass?.ToDisplayString() == CdacTypeAttributeFqn);
if (cdacAttr is null)
{
return null;
}

EquatableArray<string> names = EquatableArray<string>.FromEnumerable(
cdacAttr.ConstructorArguments[0].Values
.Select(v => (string)v.Value!)
.ToList());
bool hasTypeHandle = GetNamedBool(cdacAttr, "HasTypeHandle");
Comment on lines +45 to +63
List<MemberModel> members = new();
foreach (ISymbol member in classSymbol.GetMembers())
{
switch (member)
{
case IPropertySymbol prop:
if (TryParseProperty(prop, out MemberModel? pm))
{
members.Add(pm!);
}
break;
case IMethodSymbol method:
if (TryParseStaticMethod(method, out MemberModel? mm))
{
members.Add(mm!);
}
break;
}
}
Copilot AI review requested due to automatic review settings May 21, 2026 04:02
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 198 out of 198 changed files in this pull request and generated 5 comments.

Comment on lines +35 to +38
IncrementalValueProvider<bool> shouldEmitLayoutPair = context.CompilationProvider
.Select(static (compilation, _) =>
compilation.GetTypeByMetadataName(LayoutPairSource.FullyQualifiedName) is null);

Comment on lines +45 to +49
List<MemberModel> members = new();
foreach (ISymbol member in classSymbol.GetMembers())
{
switch (member)
{
Comment on lines +90 to +93
HasTypeHandle: hasTypeHandle,
ImplementsIData: implementsIData,
HintFilePath: syntaxRef?.SyntaxTree.FilePath,
Members: EquatableArray<MemberModel>.FromEnumerable(members));
Max Charlamb and others added 18 commits May 21, 2026 09:14
Squashed from 7 commits:

- base implementation

- update users

- Address Copilot review: fix Lock _owningThreadId type and ComWrappers null handling

- Register ManagedTypeSource contract in datadescriptor.inc

- Document ManagedTypeSource contract and update consumers

- Potential fix for pull request finding

- Add object data offset to SyncBlock.md ManagedTypeSource reads
Introduces a Roslyn incremental source generator (DataGenerator) that
emits the ctor, `IData<T>.Create` factory, `Address` property, and
optional `Write{Name}` write-back methods for cDAC `IData<T>` data
classes, from a small attribute surface:

  - `[CdacType("Foo")]` or `[CdacType(ManagedFullName = "...")]`
    selects native vs managed type descriptors.
  - `[Field]` on a property declares a descriptor-driven field read.
    Bool, primitive, pointer, NUInt, code pointer, in-place struct,
    and pointer-to-IData read kinds are supported. Nullable property
    types are treated as descriptor-optional.
  - `[Field(Writable = true)]` additionally emits a
    `Write{Name}(Target, T)` method.
  - `[FieldAddress]`, `[InstanceDataStart]`, `[FieldOffset(N)]`
    cover address arithmetic and hardcoded-offset reads.
  - `[StaticAddress]`, `[StaticReference]`,
    `[ThreadStaticAddress]` emit partial static accessor methods
    against the managed type source.
  - A `partial void OnInit(Target, TargetPointer)` hook lets the user
    do anything that doesn't fit the declarative surface.

No existing IData<T> classes are converted in this commit; that follows
separately. See docs/design/datacontracts/IData.md for the full
attribute surface and good-practices guidance.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Ports ~150 hand-written `IData<T>` data classes under
Microsoft.Diagnostics.DataContractReader.Contracts/Data/ to the
attribute-driven form supported by the DataGenerator source generator
introduced in the previous commit. Each class loses its hand-rolled
ctor/Create boilerplate in favor of declarative `[CdacType]`/
`[Field]` attributes on a `partial` class; the generator emits the
equivalent ctor, `IData<T>.Create`, `Address` property, and any
required `Write{Name}` write-back methods.

A handful of intentional surface refinements come along for the ride
(documented in IData.md):

  - Pointer-to-IData fields are stored as `TargetPointer` and
    materialized lazily by callers, instead of being eagerly
    dereferenced in the ctor. This avoids ambiguous null semantics for
    fields that may be optional or self-referential.
    Affected: Thread.RuntimeThreadLocals, plus a handful of similar
    fields whose callers in Contracts/*.cs have been updated.
  - InteropSyncBlockInfo.{RCW,CCW,CCF,TaggedMemory} switch from
    always-non-null `TargetPointer` (with `Null` sentinels for
    missing fields) to nullable `TargetPointer?`. Callers in
    SyncBlock_1.cs have been updated to handle the new nullability.
  - Thread.DebuggerControlledThreadState is now a real
    `[Field(Writable = true)]` property, and Set/Reset paths in
    Thread_1.cs use the generated `WriteDebuggerControlledThreadState`
    method instead of bespoke `ReadField`/`WriteField` calls.

JITNotification is intentionally left in hand-written form for now
because its mutable, count-driven layout doesn't map cleanly onto the
current generator surface.

All 2177 cDAC unit tests continue to pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Adds three good-practice sections informed by the IData<T> conversion
work:

  - Materialize cached instances through `ProcessedData.GetOrAdd<T>`,
    never via `new T(target, addr)` (avoids cache bypass and stale
    write-back snapshots).
  - Don't capture `Target` in instance state -- treat IData
    instances as snapshots and accept `Target` as a parameter on
    methods that need a live channel.
  - Match the descriptor's declared field type verbatim (no widening,
    narrowing, or sign-flipping); document the standard descriptor
    type -> C# type mapping and call out `bool` as the lone
    intentional deviation.
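
The first practice above can be illustrated with a minimal sketch. `ProcessedDataSketch`, `ThreadSnapshot`, and the factory-delegate signature are hypothetical stand-ins and not the real `ProcessedData.GetOrAdd<T>` API; the sketch only assumes a cache keyed by type and address, and shows why a direct `new` yields an instance the cache never sees.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical snapshot type standing in for an IData<T> class.
public sealed class ThreadSnapshot
{
    public ulong Address { get; }
    public ThreadSnapshot(ulong address) => Address = address;
}

// Hypothetical cache keyed by (type, address); not the real ProcessedData.
public sealed class ProcessedDataSketch
{
    private readonly Dictionary<(Type, ulong), object> _cache = new();

    // Returns the cached instance for (T, address), creating it on first use.
    public T GetOrAdd<T>(ulong address, Func<ulong, T> create) where T : class
    {
        var key = (typeof(T), address);
        if (_cache.TryGetValue(key, out var existing))
            return (T)existing;

        T created = create(address);
        _cache[key] = created;
        return created;
    }
}
```

Two `GetOrAdd` calls for the same address return the same object, whereas `new ThreadSnapshot(address)` always produces a separate instance that later cached reads and write-backs know nothing about.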

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… RawOffset; doc cleanup

Namespace migration:
  - Move CdacAttributes.cs from
    `Abstractions/Generated/CdacAttributes.cs` to
    `Abstractions/CdacAttributes.cs`. The `Generated` subfolder was
    misleading -- the attributes are hand-authored, not source-
    generated.
  - Change the namespace from
    `Microsoft.Diagnostics.DataContractReader.Generated` to the root
    `Microsoft.Diagnostics.DataContractReader` namespace, matching
    where `Target`, `TargetPointer`, and the other foundational
    abstractions already live.
  - Strip the now-unnecessary `using ...Generated;` directive from
    the ~150 IData<T> classes (their file-scoped `...Data` namespace
    is a child of the root and resolves the attributes automatically).
  - Update the generator's FQN constants and doc-comment to match.

FieldOffset -> RawOffset rename:
  - Rename `FieldOffsetAttribute` to `RawOffsetAttribute`. The
    old name collided with `System.Runtime.InteropServices.FieldOffset`
    once the attribute moved to the root namespace; the new name is
    also more accurate (these are raw byte offsets relative to the
    instance address, not BCL-style explicit-layout offsets).
  - Rename all `[FieldOffset(...)]` uses on IData classes
    accordingly (ImageDosHeader, ImageFileHeader, ImageNTHeaders,
    ImageOptionalHeader, ImageSectionHeader, WebcilHeader,
    WebcilSectionHeader).
  - Update Parser.cs FQN constant and emitter helper to match.

IData.md cleanup (consistency with the current code):
  - Reflect the namespace + project + attribute-name changes above.
  - Update the `[CdacType]` attribute-surface table -- the
    `DataType` enum overload was removed earlier; the recommended
    form is now `[CdacType(nameof(DataType.X))]`.
  - Sweep all worked examples to use `[CdacType(nameof(DataType.X))]`
    instead of the obsolete `[CdacType(DataType.X)]`.
  - Fix the generated `WriteFlags` example to show the string form
    that the generator actually emits.
  - Correct the `[Field(Writable = true)]` rules in two places: the
    write goes through the descriptor field offset regardless of
    which side (native or managed) supplied it.
  - Soften the `init`/`required`/`= null!` blanket prohibition
    into a positive recommendation to use `[MemberNotNull]` on
    `OnInit` for properties populated by custom logic.

Build clean; all 2177 cDAC unit tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When an IData class supplies both a native cdac descriptor name and a
managed full name, each `[Field]` resolves at runtime via a per-field
cascade: each candidate name is tried against the native descriptor
first, then against the managed metadata. The first match wins.
Motivation: Jan's review on dotnet#127310 -- types like `Lock` may move
between sources or gain partial native coverage; a single IData class
should survive that without C# changes.
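
The cascade described above can be sketched roughly as follows. This is an illustration, not the generator's emitted code: `CascadeSketch`, `FieldSource`, and the dictionary-based descriptors are hypothetical, and the sketch assumes that every candidate name is tried against the native descriptor before any name is tried against the managed metadata.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical marker for which side supplied the field.
public enum FieldSource { Native, Managed }

public static class CascadeSketch
{
    // Tries each candidate against the native map, then each against the
    // managed map. The first match wins.
    public static bool TryResolve(
        IReadOnlyDictionary<string, int> nativeFields,
        IReadOnlyDictionary<string, int> managedFields,
        IReadOnlyList<string> candidates,
        out FieldSource source,
        out string name,
        out int offset)
    {
        foreach (string candidate in candidates)
        {
            if (nativeFields.TryGetValue(candidate, out offset))
            {
                source = FieldSource.Native;
                name = candidate;
                return true;
            }
        }

        foreach (string candidate in candidates)
        {
            if (managedFields.TryGetValue(candidate, out offset))
            {
                source = FieldSource.Managed;
                name = candidate;
                return true;
            }
        }

        source = default;
        name = string.Empty;
        offset = 0;
        return false;
    }
}
```

Under that assumed ordering, a type that gains native coverage for a field starts resolving on the native side without any change to the candidate list.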

The fallback machinery is contained entirely in the generator's output;
no public type surface is added to `Abstractions`.

User-side surface (collapsed from four name-related properties to one):

  - `[Field("name1", "name2", ...)]` -- `params string[]` ctor.
    Defaults to `[propertyName]` when none given.
  - Cascade tries every name against native first, then managed.
  - `[FieldAddress(...)]` accepts the same `params string[]` shape.

LayoutPair (PostInit-emitted into the consuming assembly):

  - `LayoutPair` struct + `LayoutPairResolver.Resolve(target, ...)`
    are emitted via `RegisterPostInitializationOutput` into the
    consuming assembly, gated by a compilation check so multiple
    InternalsVisibleTo-linked assemblies don't double-emit.
  - All Read/Write/HasField/GetFieldAddress methods take a single
    `string` or `string[]` of candidate names.
  - `ManagedDataOffset` (`Object.Size` for ref types, `0` for
    value types) is applied only when the cascade resolves on the
    managed side.

Generator/parser:

  - `Target.TryGetTypeInfo(string, out TypeInfo)` -- new abstract
    method on `Target`; non-throwing form used by `LayoutPairResolver`.
  - Unified codegen: every class that needs a descriptor lookup goes
    through `LayoutPair`. The previous dual single-source vs
    cross-source code paths are gone (~120 LOC deleted from the
    emitter); `[CdacType]` parameterless + `[RawOffset]`-only
    classes still skip the resolver call.
  - `IsSourceProject=false` on the generator csproj to stop the
    repo's DownlevelLibraryImportGenerator from attaching to this
    netstandard2.0 source generator.

The ~150 existing IData<T> classes are unchanged: positional forms like
`[Field("_state")]` (Lock) and `[Field("_message")]` (Exception)
still resolve through the cascade. `Exception` is the only existing
two-source class; its descriptor field names happen to match the
managed names, so the happy path is identical to before.

DataGeneratorTests: a new self-contained xUnit sub-project under
`tests/DataGenerator/` exercises the generator's emitted code via a
minimal `TestTarget` (no dependency on the cdac mocking framework)
and 12 test-only IData classes. 10 direct `LayoutPair` unit tests
+ 19 integration scenarios cover single-source, cross-source cascade,
alias resolution, writable round-trip, optional `T?`, and
`[FieldAddress]` paths.

Test counts: 29 new tests in DataGeneratorTests; 2177 existing cdac
tests still pass (was 2187 -- the 10 LayoutPair direct tests moved into
the new sub-project). Total 2206 passing across the cdac surface.

IData.md: new Fallback section + updated attribute surface table.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…candidate

The C# property name is now always appended as the lowest-priority
candidate in the [Field] / [FieldAddress] name cascade (de-duped if
already present). This means an explicit name list still falls back
to the property name if none of the listed names matched the
descriptor, removing the need to repeat the property name in mixed
single-source/cross-source classes.

Opt out by setting UsePropertyName = false on the attribute. This is
rarely needed; it exists for cases where the C# property name happens
to collide with an unrelated descriptor field.
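
A minimal sketch of how such a candidate list could be assembled. `CandidateNames.Build` is a hypothetical helper, not the generator's parser code, and it assumes ordinal string comparison for the de-dup check.

```csharp
using System;
using System.Collections.Generic;

public static class CandidateNames
{
    // Explicit names keep their order; the property name is appended last
    // unless it is already listed or the caller opted out.
    public static List<string> Build(
        string propertyName,
        bool usePropertyName,
        params string[] explicitNames)
    {
        var result = new List<string>(explicitNames);
        if (usePropertyName && !result.Contains(propertyName))
        {
            result.Add(propertyName);
        }
        return result;
    }
}
```

For example, `Build("State", true, "_state")` yields `_state` followed by `State`, while `Build("State", false, "_state")` yields only `_state`.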

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…kind

Replaces the three-way pattern (string overload, string[] overload,
private ReadOnlySpan<string> core) with a single public method using
'params ReadOnlySpan<string> names' (C# 13).

- Single-name callers still bind to an inline span buffer (no heap
  allocation), matching the previous fast path.
- Multi-name callers can pass either comma-separated string literals
  or an existing string[] (implicit array-to-span conversion).
- Emitter's NameArgs no longer special-cases single vs multi: it
  always emits a comma-separated quoted list.
- WriteField parameter order swapped to put 'value' before the params
  names tail; Emitter codegen updated to match.
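
The call shapes listed above can be sketched with a small stand-alone example. It requires C# 13; `ParamsSpanSketch`, `FindOffset`, and `WriteFirst` are hypothetical names over plain dictionaries, not the `LayoutPair` members themselves.

```csharp
using System;
using System.Collections.Generic;

public static class ParamsSpanSketch
{
    // One method serves single-name, multi-name, and string[] callers.
    // Returns the offset of the first candidate present, or -1 if none match.
    public static int FindOffset(
        IReadOnlyDictionary<string, int> fields,
        params ReadOnlySpan<string> names)
    {
        foreach (string name in names)
        {
            if (fields.TryGetValue(name, out int offset))
                return offset;
        }
        return -1;
    }

    // A params parameter must come last, so 'value' precedes the names tail.
    public static bool WriteFirst(
        IDictionary<string, int> storage,
        int value,
        params ReadOnlySpan<string> names)
    {
        foreach (string name in names)
        {
            if (storage.ContainsKey(name))
            {
                storage[name] = value;
                return true;
            }
        }
        return false;
    }
}
```

Callers can write `FindOffset(fields, "m_state")`, `FindOffset(fields, "_state", "m_state")`, or `FindOffset(fields, existingArray)`; the last form binds through the implicit `string[]` to `ReadOnlySpan<string>` conversion.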

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

LayoutPair previously exposed one wrapper per Target read/write kind
(ReadField, ReadPointerField, ReadNUIntField, ReadCodePointerField,
ReadDataField, WriteField, GetFieldAddress, HasField). Each wrapper
did a name-cascade resolution and then forwarded to the matching
Target method.

The same shape is now generated directly into each IData ctor:
Select / TrySelect resolves once into (TypeInfo, base, name) locals
and the appropriate Target.* call runs inline. This drops the
wrapper layer entirely; optional fields also gain a free win, since
they previously did a HasField + Read pair that resolved twice.

Also folded LayoutPairResolver.Resolve into a static LayoutPair.Resolve
method -- there's no reason to keep the factory in a separate type.

Net surface: LayoutPair has TrySelect, Select, Resolve (static),
InstanceSize, ManagedDataOffset, NativeType, ManagedType. Tests use
small helpers (FieldAddress, HasField) to stay readable.
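
The saving for optional fields can be illustrated with a toy model. `ResolveOnceSketch` is hypothetical: it resolves names against a dictionary and reads from a byte array, and it counts resolutions only to make the difference between the two shapes visible. It does not reproduce the real `LayoutPair` signatures.

```csharp
using System;
using System.Collections.Generic;

public sealed class ResolveOnceSketch
{
    private readonly Dictionary<string, int> _offsets;
    private readonly byte[] _memory;

    // Number of name-cascade resolutions performed so far.
    public int ResolveCount { get; private set; }

    public ResolveOnceSketch(Dictionary<string, int> offsets, byte[] memory)
    {
        _offsets = offsets;
        _memory = memory;
    }

    // Resolves the first matching candidate name to an offset.
    public bool TrySelect(out int offset, params string[] names)
    {
        ResolveCount++;
        foreach (string name in names)
        {
            if (_offsets.TryGetValue(name, out offset))
                return true;
        }
        offset = -1;
        return false;
    }

    // Old wrapper shape: each wrapper resolves on its own.
    public bool HasField(params string[] names) => TrySelect(out _, names);

    public byte ReadField(params string[] names)
    {
        if (!TrySelect(out int offset, names))
            throw new InvalidOperationException("No candidate field name matched.");
        return _memory[offset];
    }

    // Old optional read: HasField + ReadField resolves twice.
    public byte? ReadOptionalOld(params string[] names)
        => HasField(names) ? ReadField(names) : (byte?)null;

    // New optional read: resolve once, then read inline.
    public byte? ReadOptionalNew(params string[] names)
        => TrySelect(out int offset, names) ? _memory[offset] : (byte?)null;
}
```

In this model a present optional field costs two resolutions through `ReadOptionalOld` and one through `ReadOptionalNew`.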

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

All IData types now live in Microsoft.Diagnostics.DataContractReader.Data.
The Managed/ subfolder and its separate namespace added unnecessary
indirection; types are moved to Data/ alongside all other IData classes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ize error message

- Add UsePropertyName_False tests verifying property name opt-out behavior
- Add DataPointer test verifying IData<T> pointer-chase materialization
- Improve LayoutPair.Resolve error when Object descriptor lacks Size

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The source generator now emits /// <summary> on the generated partial
class describing whether it wraps a native descriptor, managed type, or
both. Removes redundant hand-written 'Wraps ...' doc comments from four
Data classes since the information is now auto-derived from the attribute.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Write methods no longer take Target parameter (_target is captured)
- Document [ThreadStaticAddress] attribute (thread statics are supported)
- Fix cascade API references: LayoutPair.Resolve/Select, not TypeNameResolver.Resolve/LayoutPair.ReadField
- Fix InstanceDataStart description to use layouts.InstanceSize
- Add TargetNUInt to writable field types
- Update usage examples to match new signatures

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Gate helper emission on both LayoutPair and TypeNameResolver presence
- Add FullyQualifiedName constant to TypeNameResolverSource
- Remove unused HintFilePath from CdacTypeModel and Parser

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

max-charlamb force-pushed the dev/max-charlamb/cdac-source-generator-prototype branch from d14a25c to 89f52d8 on May 21, 2026 13:19