🖲️ Building something new, something better.


definitelynot.ai  Internet Universe  UC Berkeley Mathematics  Email

SF Bay Area  •  Git Page  •  All icons from iconics


OpenAI Codex: Finding the Ghost in the Machine

> [!IMPORTANT]
> Solved a pre-main() (`#[ctor::ctor]`) environment-stripping bug causing 11–300× GPU slowdowns that eluded OpenAI's debugging team for months. This was the main blocker to Codex spawning and controlling effective subagents. The regression often caused delayed CPU fallback or silent failures in ML-related tasks across all operating systems.
>
> Proof: Issue #8945 | PR #8951 | Release notes (rust-v0.80.0)

Full Investigation Details

The Ghost

In October 2025, OpenAI assembled a specialized debugging team to investigate mysterious slowdowns affecting Codex. After a week of intensive investigation: nothing.

The bug was literally a ghost — `pre_main_hardening()` executed before `main()`, stripped critical environment variables (`LD_LIBRARY_PATH`, `DYLD_LIBRARY_PATH`), and disappeared without a trace. Standard profilers saw nothing. Users saw the variables in their shell, but inside `codex exec` they vanished.
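The failure mode is easy to reproduce on a Unix shell with nothing but the standard library. In this sketch, `env_remove` stands in for what the hardening hook effectively did; the real bug ran inside a `#[ctor::ctor]` constructor before `main()` even started:

```rust
use std::process::Command;

fn main() {
    // The parent sets a loader path, as a user's shell would...
    // ...but it is stripped before the child is spawned. env_remove
    // here is illustrative; the actual stripping happened pre-main.
    let out = Command::new("sh")
        .arg("-c")
        .arg(r#"echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<missing>}""#)
        .env("LD_LIBRARY_PATH", "/opt/cuda/lib64")
        .env_remove("LD_LIBRARY_PATH")
        .output()
        .expect("failed to spawn subprocess");

    // No warning, no error: the child simply sees nothing, and a
    // GPU-dependent tool quietly falls back to CPU.
    print!("{}", String::from_utf8_lossy(&out.stdout));
    // → LD_LIBRARY_PATH=<missing>
}
```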


The Hunt

Within 3 days of their announcement, I identified the problematic commit (PR #4521) and contacted @tibo_openai.

But identification is not proof. I spent 2 months building an undeniable case.

Timeline

| Date | Event |
| --- | --- |
| Sept 30, 2025 | PR #4521 merges, enabling `pre_main_hardening()` in release builds |
| Oct 1, 2025 | rust-v0.43.0 ships (first affected release) |
| Oct 6, 2025 | First "painfully slow" regression reports |
| Oct 1–29, 2025 | Spike in env/PATH inheritance issues across platforms |
| Oct 29, 2025 | Emergency PATH fix lands (does not catch the root cause) |
| Late Oct 2025 | OpenAI's specialized team investigates, declares there is no root cause, attributes the issue to a change in user behavior |
| Jan 9, 2026 | My fix merged, credited in release notes |

Evidence Collected

| Platform | Issues | Failure Mode |
| --- | --- | --- |
| macOS | #6012, #5679, #5339, #6243, #6218 | `DYLD_*` stripping breaking dynamic linking |
| Linux/WSL2 | #4843, #3891, #6200, #5837, #6263 | `LD_LIBRARY_PATH` stripping → silent CUDA/MKL degradation |

Compiled evidence packages:

- Platform-specific failure modes
- Reproduction steps with quantifiable performance regressions (11–300×) and benchmarks
- Pattern analysis: cross-referenced 15+ scattered user reports over 3 months, traced process environment inheritance through fork/exec boundaries

Comprehensive Technical Analysis
Investigation Methodology


Why Conventional Debugging Failed

The bug was designed to be invisible:

- **Pre-main execution:** used `#[ctor::ctor]` to run before `main()`, before any logging or instrumentation
- **Silent stripping:** no warnings, no errors — just missing environment variables
- **Distributed symptoms:** appeared as unrelated issues across different platforms and configurations
- **User attribution:** everyone assumed they had misconfigured something (their shell looked fine)
- **Wrong search space:** the team was debugging post-main application code

> [!NOTE]
> Standard debugging tools cannot see pre-main execution. Profilers start at `main()`. Log hooks are not initialized yet. The code executes, modifies the environment, and vanishes.


The Impact

OpenAI confirmed and merged the fix within 24 hours, explicitly crediting the investigation in v0.80.0 release notes:

"Codex CLI subprocesses again inherit env vars like LD_LIBRARY_PATH/DYLD_LIBRARY_PATH to avoid runtime issues. As explained in #8945, failure to pass along these environment variables to subprocesses that expect them (notably GPU-related ones), was causing 10×+ performance regressions! Special thanks to @johnzfitch for the detailed investigation and write-up in #8945."

Restored:

| Capability | Who benefited |
| --- | --- |
| GPU acceleration | Internal ML/AI dev teams |
| CUDA/PyTorch | ML researchers |
| MKL/NumPy | Scientific computing users |
| Conda environments | Cross-platform compatibility |
| Enterprise drivers | Database connectivity |

When the tools are blind, the system lies, and everyone else has stopped looking.


Recent Work

- **claude-cowork-linux** ⭐35: Run Claude Desktop's Cowork feature on Linux through reverse engineering and native module stubbing
- **human-interface-markdown**: Apple Human Interface Guidelines archive (1980–2014) — 35 documents spanning Lisa, Mac, NeXT, Newton, Aqua, and iOS eras
- **claude-warden** ⭐4: Token-saving hooks for Claude Code — prevents verbose output, blocks binary reads, enforces subagent budgets
- **pyghidra-lite**: Lightweight MCP server for Ghidra reverse engineering — official MCP server listing
- **sites**: Mutable topology layer for static sites on NixOS — reconciler-based deployer with zero webhooks
- **llmx**: Codebase indexer with BM25 search and semantic chunk exports — live demo at llm.cat
- **dota**: Defense of the Artifacts — post-quantum secure secrets manager with TUI

Selected Work

- **claude-wiki**: Comprehensive Anthropic/Claude documentation wiki — 749+ docs across 24 categories
- **specHO**: LLM watermark detection via phonetic/semantic analysis (The Echo Rule) — live demo at definitelynot.ai
- **codex-patcher**: Automated code patching system for Rust with byte-span replacement and tree-sitter integration
- **htmx-docs**: Organized HTMX ecosystem documentation corpus in Markdown (htmx.org, Big Sky repos, RFC 9110/9113/9114)
- **filearchy**: COSMIC Files fork with sub-10ms trigram search across 2.15M files (Rust)
- **nautilus-plus**: Enhanced GNOME Files with sub-ms search (AUR)
- **indepacer**: PACER CLI for federal court research (PyPI: pacersdk)

Self-hosting bare metal infrastructure (NixOS) with post-quantum cryptography, authoritative DNS, and containerized services.


Live Demos

- **Cosmic Code Cleaner** @ definitelynot.ai: LLM paste sanitizer with vectorhit algorithm — fixes curly quotes, invisible Unicode, and confusable punctuation; dedents blocks
- **LLMX Ingestor** @ llm.cat: WebAssembly codebase indexer — private, deterministic chunking and BM25 search for large folders
- **LINTENIUM FIELD** @ internetuniverse.org: Terminal-based ARG experience — interactive mystery with audio visualizations
- **Observatory** @ look.definitelynot.ai: WebGPU deepfake detection running 4 ML models in the browser

Featured

Observatory — WebGPU Deepfake Detection

Live Demo: look.definitelynot.ai

Browser-based AI image detection running 4 specialized ML models (ViT, Swin Transformer) through WebGPU. Zero server-side processing; all inference happens client-side with 672MB of ONNX models.

| Model | Accuracy | Architecture |
| --- | --- | --- |
| dima806_ai_real | 98.2% | Vision Transformer |
| SMOGY | 98.2% | Swin Transformer |
| Deep-Fake-Detector-v2 | 92.1% | ViT-Base |
| umm_maybe | 94.2% | Vision Transformer |

Stack: JavaScript (ES6) • Transformers.js • ONNX • WebGPU/WASM


iconics — Semantic Icon Library

3,372+ PNG icons with semantic CLI discovery. Find the right icon by meaning, not filename.

```
icon suggest security       # → lock, shield, key, firewall…
icon suggest data           # → chart, database, folder…
icon use lock shield        # Export to ./icons/
```

Features: Fuzzy search • theme variants • batch export • markdown integration
Stack: Python • FuzzyWuzzy • PIL


filearchy + triglyph — Sub-10ms File Search

COSMIC Files fork with an embedded trigram search engine. Memory-mapped indices achieve sub-10ms searches across 2.15M+ files with near-zero resident memory.

```
filearchy/
├── triglyph/      # Trigram library (mmap)
└── triglyphd/     # D-Bus daemon for system-wide search
```

**Performance:** 2.15M files indexed • <10ms query time • 156MB index on disk
**Stack:** Rust • libcosmic • memmap2 • zbus
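The core trick, filtering candidates by trigram containment before any exact substring check, can be sketched with a hypothetical helper; triglyph's real index is memory-mapped, not a per-query HashSet:

```rust
use std::collections::HashSet;

// Hypothetical sketch: collect a string's byte trigrams. triglyph's
// real index is memory-mapped and shared system-wide via D-Bus.
fn trigrams(s: &str) -> HashSet<&[u8]> {
    s.as_bytes().windows(3).collect()
}

fn main() {
    let paths = ["Cargo.toml", "src/main.rs", "README.md"];
    let query = trigrams("main");

    // A path is a candidate only if it contains every query trigram;
    // candidates would then be verified with an exact substring check.
    let hits: Vec<&str> = paths
        .into_iter()
        .filter(|p| query.is_subset(&trigrams(p)))
        .collect();

    println!("{hits:?}"); // → ["src/main.rs"]
}
```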

The Echo Rule — LLM Detection Methodology

LLMs echo their training data. That echo is detectable through pattern recognition:

| Signature | Detection Method |
| --- | --- |
| Phonetic | CMU phoneme analysis, Levenshtein distance |
| Structural | POS tag patterns, sentence construction |
| Semantic | Word2Vec cosine similarity, hedging clusters |

Implemented in specHO with 98.6% preprocessor test pass rate. Live demo at definitelynot.ai.
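One ingredient of the phonetic signature, Levenshtein distance, fits in a short function. This is the textbook dynamic-programming form, not specHO's pipeline, which layers CMU phoneme analysis on top:

```rust
// Textbook Levenshtein edit distance (insert/delete/substitute = 1),
// one building block of the phonetic signature described above.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the current prefix of a and b[..j]
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, &ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, &cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j + 1] + 1).min(cur[j] + 1).min(prev[j] + cost));
        }
        prev = cur;
    }
    prev[b.len()]
}

fn main() {
    assert_eq!(levenshtein("kitten", "sitting"), 3);
    println!("levenshtein(kitten, sitting) = 3");
}
```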


Skills

Technical focus — skills breakdown

Core: Rust | Python | TypeScript | C | Nix | Shell


AI / ML / Agent Tooling


Infrastructure

Primary server: Dedicated bare-metal NixOS host (details available on request)

| Layer | Details |
| --- | --- |
| Security | Post-quantum SSH • Rosenpass VPN • nftables firewall |
| DNS | Unbound resolver with DNSSEC • ad/tracker blocking |
| Services | FreshRSS • Caddy (HTTPS/HTTP/3) • cPanel/WHM • Podman containers |
| Network | Local 10Gbps • Authoritative BIND9 with RFC 2136 ACME |

