Problem
For C# projects, extractTypeAnnotations runs (csharp is in TYPE_ANNOTATION_LANGUAGES) but emits zero references edges. This breaks codegraph_callers, codegraph_impact, and codegraph_context for any DTO, interface, record, or value type used as a parameter/return type — even though the same tools work correctly for TypeScript.
Reproduction
Indexed a .NET 10 + ASP.NET Core monorepo (533 C# files, 217 TS files):
SELECT n.language, e.kind, COUNT(*)
FROM edges e JOIN nodes n ON n.id = e.source
GROUP BY n.language, e.kind;
| language | kind | count |
|---|---|---|
| csharp | calls | 810 |
| csharp | extends | 70 |
| csharp | implements | 20 |
| csharp | instantiates | 169 |
| csharp | imports | 1338 |
| csharp | contains | 3296 |
| csharp | references | 0 ← bug |
| typescript | references | 192 |
Concrete query failures on the same repo:
codegraph_callers SessionInfoDto → 0 callers (grep finds 6 type usages in Produces<SessionInfoDto>, return signatures, etc.)
codegraph_callees DataExporter (a class with 11 methods, each taking/returning multiple DTO types) → returns only using imports, no type references
unresolved_refs table is empty for language='csharp' → references are never collected, not just never resolved
Root Cause (from dist/ inspection)
Two issues in extraction/languages/csharp.js and extraction/tree-sitter.js:
1. csharp.js is missing the returnField:
exports.csharpExtractor = {
methodTypes: ['method_declaration', 'constructor_declaration'],
paramsField: 'parameter_list',
// returnField missing! Falls back to 'return_type' in
// tree-sitter.js, but C# uses field name 'type'.
};
Compare: java.js, dart.js, kotlin.js, pascal.js all set returnField: 'type'. TypeScript, Rust, Python, PHP use 'return_type'. C# has no setting at all, so extractTypeAnnotations never finds the return-type child.
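To make the fallback concrete, here is a minimal sketch of the lookup as I understand it from dist/. `mockNode` and `findReturnTypeNode` are hypothetical names for illustration, not the actual codegraph internals, and the mock only models `childForFieldName`. The field name 'type' is the one observed in this report; it may differ depending on which tree-sitter-c-sharp version is bundled, so it is worth verifying against the grammar.

```javascript
// Hypothetical stand-in for a tree-sitter node: only the field lookup is modeled.
function mockNode(type, fields = {}) {
  return {
    type,
    childForFieldName: (name) => fields[name] ?? null,
  };
}

// Assumed shape of the lookup in tree-sitter.js: use the extractor's
// returnField if it is set, otherwise fall back to 'return_type'.
function findReturnTypeNode(node, extractor) {
  const field = extractor.returnField ?? 'return_type';
  return node.childForFieldName(field);
}

// A C# method_declaration whose return type sits under the 'type' field
// (field name as reported in this issue).
const method = mockNode('method_declaration', {
  type: mockNode('identifier'),
});

const current = { methodTypes: ['method_declaration'], paramsField: 'parameter_list' };
const patched = { ...current, returnField: 'type' };

console.log(findReturnTypeNode(method, current));      // null: 'return_type' does not exist on the node
console.log(findReturnTypeNode(method, patched).type); // 'identifier'
```

With the current config the lookup returns null, so the walker is never handed a return-type subtree at all.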
2. extractTypeRefsFromSubtree only matches type_identifier:
if (node.type === 'type_identifier') {
// emit 'references' edge
}
C# tree-sitter does not produce type_identifier nodes. C# type positions contain:
identifier → simple class/interface names
predefined_type → int, string, bool
qualified_name → System.String, MyNs.Foo
generic_name → List<T>, Task<Foo>
array_type, nullable_type, pointer_type
So even if paramsField/returnField resolved correctly, the walker would still emit nothing because no type_identifier node exists.
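A small sketch of why the walker comes back empty, using plain objects instead of real tree-sitter nodes. The node shape (`type`, `text`, `namedChildren`), the helper `n`, and `collectTypeRefs` are assumptions for illustration; the real extractTypeRefsFromSubtree emits edges rather than returning strings.

```javascript
// Hypothetical minimal node shape: { type, text, namedChildren }.
const n = (type, text, namedChildren = []) => ({ type, text, namedChildren });

// Task<Foo> modeled as a generic_name wrapping identifiers.
const returnType = n('generic_name', 'Task<Foo>', [
  n('identifier', 'Task'),
  n('type_argument_list', '<Foo>', [n('identifier', 'Foo')]),
]);

// Walk a type subtree and collect the text of every node whose type is in leafTypes.
function collectTypeRefs(node, leafTypes, out = []) {
  if (leafTypes.has(node.type)) {
    out.push(node.text);
    return out; // a match is treated as a leaf, no further descent
  }
  for (const child of node.namedChildren) collectTypeRefs(child, leafTypes, out);
  return out;
}

// What the walker effectively matches today vs. a C#-specific set.
const defaultLeaves = new Set(['type_identifier']);
const csharpLeaves = new Set(['identifier', 'predefined_type', 'qualified_name']);

console.log(collectTypeRefs(returnType, defaultLeaves)); // []
console.log(collectTypeRefs(returnType, csharpLeaves));  // [ 'Task', 'Foo' ]
```

In this sketch generic_name is not a leaf: the walker descends into it so that both the generic type and its arguments are captured. Whether generic_name should instead be emitted as a single reference is a design choice for the maintainer.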
Suggested Fix
a) Add returnField: 'type' to csharp.js (one line).
b) Extend extractTypeRefsFromSubtree with a language-aware node-type set, or make it a virtual method on the extractor so each language can override what counts as a "type reference leaf". For C#, recurse into type positions and capture identifier | predefined_type | qualified_name | generic_name.
Caveat: Naively matching identifier everywhere will produce false positives for parameter names. The walker should only recurse into the type field of each parameter, not the whole parameter_list. The cleanest implementation is a custom extractTypeRefs(node, source) method on the C# extractor that manually navigates parameter_list → parameter → type and method_declaration → type.
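Roughly what I have in mind for the custom method, again on mock nodes. Everything here is a sketch under assumptions: `node`, `collect`, and `TYPE_LEAVES` are hypothetical, the signature drops the `source` argument for brevity, and the 'type' field names are the ones from this report. The parameter_list is located by node type rather than by field name, so the sketch does not depend on how the grammar names that field.

```javascript
// Hypothetical mock of the node API used below.
function node(type, text, { fields = {}, children = [] } = {}) {
  return {
    type,
    text,
    namedChildren: children,
    childForFieldName: (name) => fields[name] ?? null,
  };
}

const TYPE_LEAVES = new Set(['identifier', 'predefined_type', 'qualified_name']);

// Collect type names from a node that is known to be in a type position.
function collect(typeNode, out) {
  if (!typeNode) return;
  if (TYPE_LEAVES.has(typeNode.type)) {
    out.push(typeNode.text);
    return;
  }
  for (const child of typeNode.namedChildren) collect(child, out);
}

// Only descends into 'type' fields, so method and parameter names are never visited.
function extractTypeRefs(methodNode) {
  const refs = [];
  collect(methodNode.childForFieldName('type'), refs); // method_declaration → type
  const params = methodNode.namedChildren.find((c) => c.type === 'parameter_list');
  for (const param of params?.namedChildren ?? []) {
    if (param.type !== 'parameter') continue;
    collect(param.childForFieldName('type'), refs);    // parameter → type
  }
  return refs;
}

// SessionInfoDto GetSession(Guid id, string name)
const param = (typeNode, name) =>
  node('parameter', `${typeNode.text} ${name}`, {
    fields: { type: typeNode },
    children: [typeNode, node('identifier', name)],
  });
const returnNode = node('identifier', 'SessionInfoDto');
const paramList = node('parameter_list', '(Guid id, string name)', {
  children: [
    param(node('identifier', 'Guid'), 'id'),
    param(node('predefined_type', 'string'), 'name'),
  ],
});
const methodDecl = node('method_declaration', 'SessionInfoDto GetSession(Guid id, string name)', {
  fields: { type: returnNode },
  children: [returnNode, node('identifier', 'GetSession'), paramList],
});

console.log(extractTypeRefs(methodDecl)); // [ 'SessionInfoDto', 'Guid', 'string' ]
```

Note that `GetSession`, `id`, and `name` are all identifier nodes but never show up, because the walk starts only from type fields. Predefined types like `string` would probably be filtered out before edges are written.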
Impact
C# coverage is a major use case for the project (Java/Kotlin/Dart all handle type references; C# is the outlier). Without references edges, the headline feature "structural lookups grep can't do" — finding all consumers of a DTO or interface across a codebase — silently degrades to text search for ~half of typical backend stacks.
Happy to take this on as a PR if a maintainer can sanity-check the fix approach (custom extractor method vs language-aware walker extension).
Environment
- codegraph version: latest npm @colbymchenry/codegraph
- Node: lts via mise
- Project: .NET 10, mixed with Angular 21 TypeScript frontend
- DB backend: node:sqlite WAL