Add ElfSymbolModule for Parsing ELF Symbol Tables#2384
Merged
Conversation
Add ElfSymbolModule, which reads ELF (Executable and Linkable Format) files and resolves RVAs to symbol names. This enables PerfView and TraceEvent to resolve native symbols from Linux ELF binaries when analyzing universal traces captured on Linux. Key capabilities: - Parses both .symtab and .dynsym sections - Supports 32-bit and 64-bit ELF, little-endian and big-endian - Implements ISymbolLookup for integration with TraceEvent's symbol resolution pipeline - Demangles Itanium C++ (_Z) and Rust v0 (_R) mangled names Design for performance: - Lazy name resolution: symbol names decoded on first FindNameForRva hit, not during construction - LOH-free strtab via SegmentedList<byte> with 64KB segments - Two-pass section scan pre-allocates all structures with zero resizes - BitConverter fast path for 64-bit LE symbol entries - O(log n) zero-allocation lookups via sorted array binary search Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add comprehensive test suite with 26 tests covering all ElfSymbolModule code paths: - Error handling: invalid magic, truncated, empty, bad class, no sections - Format coverage: 64-bit LE, 32-bit LE, 64-bit BE, 32-bit BE - Symbol filtering: non-function types, zero-value, zero-size, below PT_LOAD - RVA adjustment with pVaddr and pOffset - Demangling integration: Itanium C++, Rust v0, plain passthrough - .dynsym section parsing alongside .symtab - Edge cases: empty table, RVA zero, 100-symbol binary search stress - File path constructor validation ElfBuilder is a test helper that constructs synthetic minimal ELF binaries in memory, avoiding the need for checked-in test binaries. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add BenchmarkDotNet benchmarks for ELF symbol parsing and lookup using ElfBuilder-generated synthetic ELF binaries (5, 256, and 10000 symbols). No external file dependencies. Parse benchmarks measure construction time and memory allocation. Lookup benchmarks measure FindNameForRva performance (zero-alloc). Link ElfBuilder.cs from the test project to share the ELF binary builder without duplicating code. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
3328404 to
e5cf6d2
Compare
leculver
approved these changes
Mar 20, 2026
This was referenced Mar 31, 2026
Merged
This was referenced Apr 7, 2026
This was referenced Apr 15, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds
ElfSymbolModule, an ELF (Executable and Linkable Format) symbol parser that resolves RVAs to symbol names. This enables PerfView and TraceEvent to resolve native symbols from Linux ELF binaries when analyzing universal traces captured on Linux.Key capabilities
.symtaband.dynsymsectionsISymbolLookupfor integration with TraceEvent's symbol resolution pipeline_Z) and Rust v0 (_R) mangled names using demanglers from Implement Symbol Demanglers for Linux Binaries #2383.Design
FindNameForRvahit, not during construction. Most symbols in a trace are never looked up — this avoids unnecessary work.SegmentedList<byte>with 64KB segments.Performance
To give a sense of existing vs. new performance of symbol parsing and lookup.
ELF vs PDB comparison (CoreCLR, 18K symbols, net8.0)
ELF parses 3.2x faster than PDB and first lookups are 303x faster (sorted array binary search vs COM interop). PDB's lower parse memory is due to DIA's lazy/deferred loading model.
Lookup benchmarks (zero-alloc)
Testing
Unit tests (26 tests)
Synthetic ELF binaries generated by
ElfBuilder(no checked-in test binaries):Offline validation against pyelftools (568 real ELF files, 109,293 symbols)
Validated the parser against a known-good reference:
.debugfiles: the executable PT_LOAD segment parameters and allSTT_FUNCsymbols (st_value, st_size, raw mangled name).ElfSymbolModulewith demangling disabled for each file, then verifies that every reference symbol is found at the correct RVA with an exact match on start address and raw name.Result: zero mismatches across all 568 files and 109,293 symbols.
Contributes to #2382.