AArch64 DWARF unwind + macOS os_signpost integration #176
Open
angerman wants to merge 11 commits into
The AArch64 native code generator was silently discarding CmmUnwind nodes (`return nilOL`), making DWARF-based profiling and debugging impossible on Apple Silicon and ARM64 Linux.
This commit adds full DWARF unwind support, mirroring the existing X86 implementation:
- Add UNWIND pseudo-instruction to AArch64.Instr
- Convert CmmUnwind nodes to UNWIND instructions in CodeGen
- Emit UNWIND after DELTA via addSpUnwindings for SP tracking
- Wire extractUnwindPoints into the AArch64 NcgImpl record
- Pretty-print UNWIND as a label + comment in Ppr
With this change, `ghc -g` on AArch64 produces .debug_frame entries, enabling `lldb` backtraces, `dwarfdump --debug-frame`, and sampler-based profilers (Instruments, Samply) to unwind through Haskell code.
GHC-compiled programs are invisible to Apple Instruments because the
RTS emits no os_signpost events. This makes it hard to correlate GC
pauses and thread scheduling with system-level activity on macOS.
Add a new Signpost.c/Signpost.h module that bridges RTS events to the
os_signpost API, using OS_LOG_CATEGORY_POINTS_OF_INTEREST so events
appear in Instruments by default without a custom .instrpkg:
- GC intervals: begin/end pairs tracked per-capability with unique
signpost IDs, emitting generation, bytes copied, and slop
- Thread lifecycle: create/run/stop as point events with cap and tid
- User events: traceEvent#/traceMarker# forwarded as signposts
All functions gate on os_signpost_enabled() so the overhead when
Instruments is not attached is near zero (a single branch on the
log handle's signpost-enabled flag).
On non-Darwin platforms, all functions compile to empty macros.
Integration points:
- Stats.c: GC begin/end with full statistics
- Trace.h/Trace.c: thread and user event forwarding
- RtsStartup.c: init after initScheduler, free before endTracing
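For readers unfamiliar with the pattern, here is a minimal sketch of the "Darwin functions / non-Darwin empty macros" layout with the `os_signpost_enabled()` gate described above. It is not the contents of Signpost.h: the names `sketch_log`, `sketchSignpostInit`, `sketchSignpostThreadRun`, the `SKETCH_HAVE_SIGNPOSTS` macro, and the subsystem string are all hypothetical.

```c
#include <stdint.h>

#if defined(__APPLE__)
#include <os/log.h>
#include <os/signpost.h>

#define SKETCH_HAVE_SIGNPOSTS 1

/* Hypothetical log handle; the real module may organise this differently. */
static os_log_t sketch_log;

static void sketchSignpostInit(void)
{
    /* The subsystem string is a placeholder. The Points of Interest
     * category is what makes events show up in Instruments by default. */
    sketch_log = os_log_create("org.example.sketch",
                               OS_LOG_CATEGORY_POINTS_OF_INTEREST);
}

static void sketchSignpostThreadRun(uint32_t cap, uint64_t tid)
{
    /* Gate: skip all argument formatting unless a tool is listening. */
    if (os_signpost_enabled(sketch_log)) {
        os_signpost_event_emit(sketch_log, OS_SIGNPOST_ID_EXCLUSIVE,
                               "ThreadRun", "cap=%u tid=%llu",
                               cap, (unsigned long long)tid);
    }
}
#else
/* Non-Darwin: every entry point compiles away to nothing. */
#define SKETCH_HAVE_SIGNPOSTS 0
#define sketchSignpostInit()              do { } while (0)
#define sketchSignpostThreadRun(cap, tid) do { (void)(cap); (void)(tid); } while (0)
#endif
```

Call sites stay identical on every platform, so the integration points need no `#if` of their own.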
b203c9e to
eeb6c01
The AArch64 NCG was explicitly excluded from DWARF debug info generation despite having full UNWIND pseudo-instruction support. This meant all computed unwind data was silently discarded.
Changes:
- Remove ArchAArch64 exclusion from ncgDwarfEnabled, enabling .debug_frame output on AArch64 ELF (Linux)
- Define REG_MachSp as r31 in arm64.h so the C stack pointer maps to DWARF register 31 (SP) instead of 0 (x0). Without this, all addSpUnwindings output incorrectly described x0 changes.
- Add DW_CFA_same_value for AArch64 SP (register 31) in the CIE initial instructions. This prevents the DWARF unwinder from incorrectly setting SP = CFA (which is the STG Sp on x20).
- Fix UNWIND Note reference to point to GHC.CmmToAsm (where the Note actually lives) instead of X86/Instr.hs.
Add basic block structure validation to the AArch64 code generator, mirroring the existing X86 implementation. This catches NCG bugs where non-control-flow instructions appear after block-terminating jumps, which would violate the basic block invariant. BL (Branch and Link) is exempted from the block-end check since it is a call that returns to the caller, not a block terminator. Only active when debugIsOn (debug builds).
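The invariant can be illustrated with a small model. This is a sketch only, in C rather than the Haskell of the actual check, and the instruction kinds and function names are hypothetical. It assumes that BL never marks the end of a block, and that anything other than a jump appearing after a terminator is a violation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced instruction model. */
typedef enum {
    SK_OTHER,  /* ordinary, non-control-flow instruction        */
    SK_B,      /* unconditional branch: ends the block          */
    SK_BCOND,  /* conditional branch: may appear at a block end */
    SK_BL      /* branch and link: a call, execution continues  */
} SketchInstr;

/* BL is deliberately excluded: it is not a block terminator. */
static bool sketch_is_terminator(SketchInstr i)
{
    return i == SK_B || i == SK_BCOND;
}

/* Returns true when, once a terminating jump has been seen,
 * only further jumps follow it within the block. */
static bool sketch_block_ok(const SketchInstr *instrs, size_t n)
{
    bool at_end = false;
    for (size_t k = 0; k < n; k++) {
        if (sketch_is_terminator(instrs[k])) {
            at_end = true;
        } else if (at_end) {
            return false;  /* non-jump after the block has ended */
        }
    }
    return true;
}
```

In this model a sequence such as `OTHER, BL, OTHER, B` is accepted, while `OTHER, B, OTHER` is rejected.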
DWARF generation was gated on osElfTarget, excluding all MachO targets despite the DWARF assembly output code already handling MachO section directives (__DWARF,__debug_*), darwin-specific alignment (.align as log2), and section offsets.
Changes:
- Replace osElfTarget gate with osDwarfTarget that accepts both ELF and MachO, enabling -g on macOS/darwin
- Make DWARF section labels (dwarfInfoLabel, etc.) platform-aware: use "L" prefix on darwin (MachO convention) instead of hardcoded ".L" (ELF convention), via asmTempLabelPrefix
With this change, ghc -g on AArch64-darwin produces .debug_frame entries visible to dsymutil, lldb, and Instruments.
Add DWARF Call Frame Information directives to the AArch64 StgRun function so debuggers and profilers can unwind through the Haskell↔C boundary. Without CFI, tools like lldb, gdb, and perf cannot produce backtraces that cross from Haskell into C code.
The CFA is anchored at x29+16 (the frame pointer saved by the first stp), and all callee-saved registers (x19-x28, x16-x17, x29-x30, d8-d15) are annotated with their save locations on the C stack.
Enable ENABLE_UNWINDING on AArch64-darwin in addition to Linux. The original restriction (#15207) was about x86_64 GCC/Clang assembler incompatibilities that do not apply to AArch64, where both Linux and darwin use Clang-compatible assemblers.
asmTempLabelPrefix is not exported from GHC.Cmm.CLabel. Use a local dwarfLocalLabel helper that implements the same logic: "L" on darwin, ".L" on ELF targets.
The assembler was reporting 'local symbol LcXX_proc_end not defined' because AArch64/Ppr.hs never emitted _proc_end labels that DWARF .debug_info and .debug_frame reference for procedure address ranges.
Add pprProcEndLabel and pprBlockEndLabel helpers (matching the X86 pattern) and emit them:
- At the end of each basic block (since blocks may become standalone top-level blocks after branch-chain elimination)
- At the end of each procedure in pprNatCmmDecl (both with and without info tables)
This fixes 14 test failures on aarch64-darwin with DWARF enabled.
The MachO assembler cannot handle relocations against local symbols
(L-prefixed labels on darwin) in DWARF debug sections, producing:
error: unsupported relocation of local symbol 'Lc134_die'.
Must have non-local symbol earlier in section.
This is a fundamental MachO assembler limitation that requires either
non-local DWARF labels or section anchor symbols to resolve.
Revert to ELF-only DWARF debug sections for now. The MachO-related
infrastructure (section directives, local label prefix support, CFI
directives in StgRun) is kept in place for future MachO DWARF work.
CFI directives (.cfi_*) in StgCRun.c remain enabled on AArch64-darwin
as they produce .eh_frame entries that the system tools handle fine.
On MachO, the assembler cannot create relocations against temporary symbols (L-prefixed) in DWARF debug sections unless there is a non-temporary symbol earlier in the section to serve as the relocation base. Without such an anchor, the assembler fails with:
error: unsupported relocation of local symbol 'Lfoo'.
Must have non-local symbol earlier in section.
This was preventing DWARF debug info (-g) from working on macOS/darwin.
The fix emits a linker-private anchor symbol (l_ prefix) at the start of each DWARF section (.debug_info, .debug_abbrev, .debug_line, .debug_frame, .debug_aranges). The l_ prefix gives us:
- A symbol table entry (assembler can create relocations against it)
- Local binding (no duplicate symbol errors across compilation units)
Also fixes the label ordering in .debug_info where the section label was emitted BEFORE the section directive (placing it in the wrong section).
With these anchors in place, DWARF is re-enabled on MachO targets.
Additional changes:
- Gate ncgComputeUnwinding to DWARF-capable targets (ELF + MachO) to avoid wasting work on platforms that cannot emit debug info
- Document info table alignment decision in AArch64/Ppr.hs
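To make the ordering concrete, here is a rough sketch of the text emitted at the start of a DWARF section: section directive first, then the anchor (MachO only), then the section label. It is written in C rather than the compiler's Haskell, and the function name, the anchor symbol name, and the `section_<name>` label spelling are assumptions for illustration, not the names the patch uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Writes the opening lines of a DWARF section into buf and returns
 * snprintf's result. All emitted symbol names are hypothetical. */
static int sketch_dwarf_section_start(char *buf, size_t len,
                                      bool macho, const char *name)
{
    if (macho) {
        return snprintf(buf, len,
            /* 1. Switch section first, so the labels land inside it. */
            "\t.section __DWARF,__debug_%s,regular,debug\n"
            /* 2. Linker-private anchor: a non-temporary symbol that
             *    later L-prefixed labels can be relocated against.   */
            "l_sketch_debug_%s_anchor:\n"
            /* 3. The temporary section label itself ("L" prefix).    */
            "Lsection_%s:\n",
            name, name, name);
    }
    /* ELF needs no anchor; temporary labels use the ".L" prefix. */
    return snprintf(buf, len,
        "\t.section .debug_%s,\"\",@progbits\n"
        ".Lsection_%s:\n",
        name, name);
}
```

Calling it with `macho = true` and `name = "info"` yields the `__DWARF,__debug_info` directive followed by the anchor and then `Lsection_info:`.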
Add signpostsAddCapabilities() to resize per-capability signpost ID arrays when setNumCapabilities grows the number of capabilities at runtime. Without this, new capabilities' GC intervals would not be tracked in Instruments (graceful degradation via bounds check, but data loss). Follows the pattern of tracingAddCapabilities() and storageAddCapabilities() in Schedule.c.
Also use pprBlockEndLabel helper consistently in AArch64/Ppr.hs instead of manually constructing the label.
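The resize-plus-bounds-check behaviour described for signpostsAddCapabilities() can be sketched as follows. This is an illustration under assumptions, not the code in Signpost.c: IDs are modelled as plain `uint64_t` rather than `os_signpost_id_t`, and every `sketch`-prefixed name and signature is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* One in-flight GC interval ID per capability (0 means "none"). */
static uint64_t *sketch_gc_ids = NULL;
static uint32_t  sketch_n_caps = 0;

/* Grow the array to new_n entries, keeping existing IDs and zeroing
 * the new slots. Returns 0 on success, -1 on allocation failure. */
static int sketchSignpostsAddCapabilities(uint32_t new_n)
{
    if (new_n <= sketch_n_caps) return 0;  /* never shrinks */
    uint64_t *p = realloc(sketch_gc_ids, (size_t)new_n * sizeof(uint64_t));
    if (p == NULL) return -1;
    for (uint32_t i = sketch_n_caps; i < new_n; i++) p[i] = 0;
    sketch_gc_ids = p;
    sketch_n_caps = new_n;
    return 0;
}

/* Record the interval ID for a capability. Returns 0 (event dropped)
 * when cap is out of range: the graceful-degradation path. */
static int sketchGcBegin(uint32_t cap, uint64_t id)
{
    if (cap >= sketch_n_caps) return 0;
    sketch_gc_ids[cap] = id;
    return 1;
}

/* Fetch and clear the ID to pair with the begin event; 0 if none. */
static uint64_t sketchGcEnd(uint32_t cap)
{
    if (cap >= sketch_n_caps) return 0;
    uint64_t id = sketch_gc_ids[cap];
    sketch_gc_ids[cap] = 0;
    return id;
}
```

Without the resize call, a begin event on a newly added capability takes the out-of-range path and is dropped, which is the data loss the commit message refers to.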
Summary
Two independent, complementary features that make GHC-compiled Haskell code visible to standard profiling tools on macOS:
Part 1: AArch64 DWARF Unwind Support (GHC #19913)
The AArch64 NCG was silently discarding `CmmUnwind` nodes (`return nilOL`), producing no DWARF unwind information. This made `lldb bt`, Instruments, and Samply unable to unwind through Haskell frames on Apple Silicon.
- Add `UNWIND` pseudo-instruction to AArch64 `Instr` data type (mirrors X86)
- Add `CmmUnwind` → `UNWIND` conversion in `stmtToInstrs`
- Add `addSpUnwindings` to emit `UNWIND` after `DELTA` (tracks SP changes)
- Implement `extractUnwindPoints` and wire into `NcgImpl` (was `const []`)
Part 2: macOS os_signpost Integration
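The RTS had no `os_signpost` support, making GC pauses, thread events, and user events invisible in Apple Instruments.
- Add `rts/Signpost.{h,c}` with os_signpost API wrappers
- Forward user events (`traceEvent#`/`traceMarker#`) as signposts
- Near-zero overhead when Instruments is not attached (`os_signpost_enabled()` gate)
Files Changed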
Compiler (AArch64 DWARF):
- `compiler/GHC/CmmToAsm/AArch64/Instr.hs`: UNWIND constructor + pattern matches
- `compiler/GHC/CmmToAsm/AArch64/CodeGen.hs`: CmmUnwind handler, addSpUnwindings, extractUnwindPoints
- `compiler/GHC/CmmToAsm/AArch64/Ppr.hs`: UNWIND pretty-printing
- `compiler/GHC/CmmToAsm/AArch64.hs`: Wire extractUnwindPoints into NcgImpl
RTS (os_signpost):
- `rts/Signpost.h`: Header with Darwin functions / non-Darwin empty macros
- `rts/Signpost.c`: os_signpost implementation
- `rts/RtsStartup.c`: initSignposts/freeSignposts lifecycle
- `rts/Stats.c`: GC begin/end signpost calls
- `rts/Trace.h`: Thread event signpost calls
- `rts/Trace.c`: User event signpost calls
- `rts/rts.cabal`: Add Signpost.c to build
Test plan
- Compile with `-g` and verify `dwarfdump --debug-frame` shows FDE entries
- Run under `lldb` and verify `bt` shows Haskell frames
- Verify `traceEvent`/`traceMarker` events appear as signpost events