
perf: bounds-check elimination in loop-indexed array access #506

Merged
cs01 merged 1 commit into main from perf/bounds-check-elim
Apr 13, 2026

cs01 commented Apr 13, 2026

Why this matters

Every arr[i] emits a runtime bounds check (icmp + branch + fail-path + llvm.assume). In loops where the index is provably in-bounds, those checks are pure overhead that bloat the generated IR and prevent LLVM from vectorizing cleanly. This PR teaches the compiler to recognize the most common safe loop pattern and emit a fast-path GEP with no bounds check at all.

What users will see

  • Tighter generated code for the canonical while (i < arr.length) and for (let i = 0; i < arr.length; i++) patterns. Fewer icmp/branch instructions inside the hot loop, fewer basic blocks per access, smaller .ll output.
  • Measured impact on the test fixtures: the new array-bounds-elim-safe.ts fixture drops from 11 bounds-check labels to 2 when compiled with the patched compiler. The stringsearch benchmark drops from 8 to 5 bounds-check labels; stringops drops from 5 to 2.
  • Safety preserved: out-of-bounds accesses still crash. Two new fixture tests (array-bounds-elim-unsafe-offset.ts, array-bounds-elim-unsafe-pop.ts) verify that offset indices (arr[i+5] inside an i < arr.length loop) and array-mutating loop bodies (e.g. arr.pop() inside the loop) keep their runtime bounds checks.
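To make the two unsafe cases concrete, here is a minimal sketch in plain TypeScript. The contents of the actual fixtures are not shown in this PR, so the function names below are hypothetical, and `checkedGet` is only a stand-in for the compiler-emitted bounds check (it throws instead of exiting the process).

```typescript
// Hypothetical stand-in for a compiler-emitted bounds check:
// throw instead of reading out of range.
function checkedGet(arr: number[], i: number): number {
  if (i < 0 || i >= arr.length) {
    throw new RangeError(`index ${i} out of bounds for length ${arr.length}`);
  }
  return arr[i];
}

// Offset index: the loop condition i < arr.length says nothing about i + 5.
function sumWithOffset(arr: number[]): number {
  let sum = 0;
  for (let i = 0; i < arr.length; i = i + 1) {
    sum = sum + checkedGet(arr, i + 5);
  }
  return sum;
}

// Mutating body: pop() shrinks the array after the loop condition was evaluated,
// so i can equal arr.length by the time the access happens.
function sumWhilePopping(arr: number[]): number {
  let sum = 0;
  let i = 0;
  while (i < arr.length) {
    arr.pop();
    sum = sum + checkedGet(arr, i);
    i = i + 1;
  }
  return sum;
}

// Helper: reports whether calling fn raises a RangeError.
function throwsRangeError(fn: () => number): boolean {
  try {
    fn();
    return false;
  } catch (e) {
    return e instanceof RangeError;
  }
}

const offsetFails = throwsRangeError(() => sumWithOffset([1, 2, 3]));
const popFails = throwsRangeError(() => sumWhilePopping([1, 2, 3]));
console.log(offsetFails, popFails);
```

Both loops use the canonical `i < arr.length` condition, yet both go out of range, which is why neither shape can take the unchecked fast path.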

What's actually eliminated

The analyzer runs before loop body codegen and registers (indexVar, arrayVar) pairs on a per-loop-scope stack. The IndexAccessGenerator consults the stack when it sees a direct arr[i] pattern and skips emitBoundsCheck when the pair is proven safe.
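The scope-stack idea can be sketched as follows. This is an illustration only: `SafePair`, `SafeIndexScopes`, and `emitIndexAccess` are hypothetical names, not the compiler's real types, and the sketch assumes that a pair proven safe by an outer loop remains valid inside nested loops.

```typescript
// All names here are hypothetical; the PR does not show the analyzer's real data structures.
interface SafePair {
  indexVar: string;
  arrayVar: string;
}

class SafeIndexScopes {
  private stack: SafePair[][] = [];

  // Called before generating a loop body, with the pairs the analyzer proved safe.
  enterLoop(pairs: SafePair[]): void {
    this.stack.push(pairs);
  }

  // Called once the loop body has been generated.
  exitLoop(): void {
    this.stack.pop();
  }

  // True if some enclosing loop scope registered this (indexVar, arrayVar) pair.
  isProvenSafe(indexVar: string, arrayVar: string): boolean {
    return this.stack.some(pairs =>
      pairs.some(p => p.indexVar === indexVar && p.arrayVar === arrayVar)
    );
  }
}

// Toy "codegen": returns a description of what would be emitted for arrayVar[indexVar].
function emitIndexAccess(scopes: SafeIndexScopes, arrayVar: string, indexVar: string): string {
  return scopes.isProvenSafe(indexVar, arrayVar) ? "gep" : "bounds-check + gep";
}

const scopes = new SafeIndexScopes();
const before = emitIndexAccess(scopes, "arr", "i");
scopes.enterLoop([{ indexVar: "i", arrayVar: "arr" }]);
const inside = emitIndexAccess(scopes, "arr", "i");
const otherIndex = emitIndexAccess(scopes, "arr", "j");
scopes.exitLoop();
const after = emitIndexAccess(scopes, "arr", "i");
console.log(before, "|", inside, "|", otherIndex, "|", after);
```

Only the access made while the matching pair is on the stack skips the check; the same access before or after the loop, or with a different index variable, stays checked.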

Supported patterns:

  • for (let i = 0; i < arr.length; i = i + 1) { ... arr[i] ... } (Pattern A)
  • while (i < arr.length) { ... arr[i] ...; i = i + 1 } (Pattern B)
  • Monotone non-negative stride i = i + k where k is a non-negative numeric literal (Pattern C)
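For reference, the three shapes look like this as ordinary TypeScript. The function names are made up for illustration, and the stride of 2 in the last one is just one example of a non-negative literal k.

```typescript
// Pattern A: counted for-loop bounded by arr.length.
function sumFor(arr: number[]): number {
  let sum = 0;
  for (let i = 0; i < arr.length; i = i + 1) {
    sum = sum + arr[i];
  }
  return sum;
}

// Pattern B: while-loop with the increment at the end of the body.
function sumWhile(arr: number[]): number {
  let sum = 0;
  let i = 0;
  while (i < arr.length) {
    sum = sum + arr[i];
    i = i + 1;
  }
  return sum;
}

// Pattern C: same loop shape with a literal stride (k = 2 here).
function sumEveryOther(arr: number[]): number {
  let sum = 0;
  let i = 0;
  while (i < arr.length) {
    sum = sum + arr[i];
    i = i + 2;
  }
  return sum;
}

const data = [1, 2, 3, 4, 5];
console.log(sumFor(data), sumWhile(data), sumEveryOther(data)); // 15 15 9
```

In each case the only access is a direct `arr[i]`, the index starts at 0 and only grows, and the loop condition re-tests `i < arr.length` before every access.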

Soundness bail-outs (conservative — keep the bounds check):

  • any push/pop/shift/unshift/splice/sort/reverse/fill/copyWithin on the array inside the body
  • reassignment of the array variable
  • any non-monotone update to the index variable
  • any nested loop that rebinds either variable
  • unknown statement shapes — fall back to checked access
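A rough sketch of how such a conservative body scan could work, over a toy statement type. The `Stmt` shape and `bodyKeepsPairSafe` are assumptions made for this example, not the compiler's real AST or API, and nested-loop handling is left out for brevity.

```typescript
// Toy statement shapes; the real compiler's AST is not shown in the PR, so these are assumptions.
type Stmt =
  | { kind: "methodCall"; receiver: string; method: string }
  | {
      kind: "assign";
      target: string;
      value: { kind: "add"; left: string; literal: number } | { kind: "other" };
    }
  | { kind: "read"; array: string; index: string }
  | { kind: "unknown" };

// The mutating methods listed in the bail-out rules above.
const MUTATORS = new Set([
  "push", "pop", "shift", "unshift", "splice", "sort", "reverse", "fill", "copyWithin",
]);

// Returns true only if nothing in the body can invalidate "indexVar < arrayVar.length".
function bodyKeepsPairSafe(body: Stmt[], indexVar: string, arrayVar: string): boolean {
  for (const s of body) {
    switch (s.kind) {
      case "methodCall":
        if (s.receiver === arrayVar && MUTATORS.has(s.method)) return false;
        break;
      case "assign":
        if (s.target === arrayVar) return false; // array variable reassigned
        if (s.target === indexVar) {
          const v = s.value;
          // Only i = i + k with a non-negative literal k counts as monotone.
          const monotone = v.kind === "add" && v.left === indexVar && v.literal >= 0;
          if (!monotone) return false;
        }
        break;
      case "read":
        break; // plain reads never invalidate the proof
      default:
        return false; // unknown shape: fall back to checked access
    }
  }
  return true;
}

const safeBody: Stmt[] = [
  { kind: "read", array: "arr", index: "i" },
  { kind: "assign", target: "i", value: { kind: "add", left: "i", literal: 1 } },
];
const poppingBody: Stmt[] = [{ kind: "methodCall", receiver: "arr", method: "pop" }, ...safeBody];
const decrementBody: Stmt[] = [{ kind: "assign", target: "i", value: { kind: "other" } }];

console.log(
  bodyKeepsPairSafe(safeBody, "i", "arr"),
  bodyKeepsPairSafe(poppingBody, "i", "arr"),
  bodyKeepsPairSafe(decrementBody, "i", "arr")
);
```

The default branch is the important design choice: anything the scan does not recognize disables the optimization, so a missed pattern costs performance rather than safety.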

Coverage limitations (honest call-outs)

  • Compound index expressions are not optimized. a[row * N + k] in benchmarks/matmul/chadscript.ts is not a direct arr[i] access, so bounds checks remain. Matmul timings are unchanged by this PR (~109ms on my laptop, same as main).
  • Non-.length loop bounds are not optimized. while (i < N) or while (p * p <= LIMIT) (sieve, nbody, quicksort) do not match the pattern: the analyzer only recognizes i < arr.length where arr is the access target. A future PR could track fixed-length arrays (new Uint8Array(N+1)) and prove that a bound such as i < N+1 implies i < arr.length.
  • Index-access assignments (arr[i] = x) still emit bounds checks. The primary win is read-heavy loops.
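The three unoptimized shapes, written out as small TypeScript functions. These are illustrative stand-ins, not code from the benchmarks; all names are hypothetical.

```typescript
// Compound index: a[row * n + k] is not a direct arr[i] access, so it stays checked.
function rowSum(a: number[], n: number, row: number): number {
  let sum = 0;
  let k = 0;
  while (k < n) {
    sum = sum + a[row * n + k];
    k = k + 1;
  }
  return sum;
}

// Non-.length bound: the loop is limited by `limit`, not by flags.length, so it stays checked.
function countSet(flags: Uint8Array, limit: number): number {
  let count = 0;
  let i = 0;
  while (i < limit) {
    if (flags[i] === 1) {
      count = count + 1;
    }
    i = i + 1;
  }
  return count;
}

// Index assignment: the loop matches Pattern A, but writes keep their bounds check.
function fillWith(arr: number[], x: number): void {
  for (let i = 0; i < arr.length; i = i + 1) {
    arr[i] = x;
  }
}

const filled = [0, 0, 0];
fillWith(filled, 7);
console.log(rowSum([1, 2, 3, 4, 5, 6], 3, 1), countSet(new Uint8Array([1, 0, 1, 1]), 4), filled);
```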

Scope

Applied to four access paths in src/codegen/expressions/access/index.ts:

  • generateNumericArrayIndex (number[])
  • generateStringArrayIndex (string[])
  • generateObjectArrayIndex (object[])
  • generateUint8ArrayIndex (Uint8Array reads)

JSON array index paths and index-assignment paths are unchanged.

Test plan

  • npm test — 777/777 pass (3 new fixtures)
  • bash scripts/self-hosting.sh --quick — Stage 0 + Stage 1 both green
  • array-bounds-elim-safe.ts — exercises the pattern, asserts correct sums
  • array-bounds-elim-unsafe-offset.ts — proves out-of-bounds arr[i+5] still exits 1
  • array-bounds-elim-unsafe-pop.ts — proves mutating loop keeps its bounds check
  • Verified IR: array-bounds-elim-safe.ts goes from 11 to 2 bounds_fail labels
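One way to reproduce the label count is to scan the emitted .ll text for basic-block labels. The sketch below assumes the fail-path blocks are named with a `bounds_fail` prefix (the exact naming scheme is not shown in this PR), and the sample IR is invented for illustration.

```typescript
// Counts LLVM basic-block labels whose name starts with `prefix`.
// Assumes textual IR where a label line looks like `name:` (optionally followed by a comment).
function countLabels(ir: string, prefix: string): number {
  let count = 0;
  for (const line of ir.split("\n")) {
    const trimmed = line.trim();
    const colon = trimmed.indexOf(":");
    if (colon <= 0 || !trimmed.startsWith(prefix)) continue;
    // Everything before the colon must be a plain identifier, which excludes
    // instructions that merely mention the label (e.g. `br ... label %bounds_fail0`).
    if (/^[\w.$-]+$/.test(trimmed.slice(0, colon))) {
      count = count + 1;
    }
  }
  return count;
}

// Hypothetical IR fragment, not actual compiler output.
const sampleIr = [
  "entry:",
  "  %ok = icmp ult i64 %i, %len",
  "  br i1 %ok, label %bounds_ok0, label %bounds_fail0",
  "bounds_fail0:",
  "  call void @abort()",
  "  unreachable",
  "bounds_ok0:",
  "  ret void",
].join("\n");

const failLabels = countLabels(sampleIr, "bounds_fail");
console.log(failLabels);
```

Running the same count over the .ll output from main and from this branch gives a before/after comparison like the 11 to 2 figure above.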

@github-actions

Benchmark Results (Linux x86-64)

| Benchmark | C | ChadScript | Go | Node | Place |
|---|---|---|---|---|---|
| Binary Trees | 1.436s | 1.204s | 2.707s | 1.183s | 🥈 |
| Cold Start | 0.9ms | 0.8ms | 1.2ms | 25.9ms | 🥇 |
| Fibonacci | 0.815s | 0.815s | 1.561s | 3.162s | 🥇 |
| File I/O | 0.116s | 0.090s | 0.084s | 0.196s | 🥈 |
| JSON Parse/Stringify | 0.004s | 0.005s | 0.017s | 0.016s | 🥈 |
| Matrix Multiply | 0.435s | 0.994s | 0.616s | 0.379s | #4 |
| Monte Carlo Pi | 0.389s | 0.410s | 0.405s | 2.248s | 🥉 |
| N-Body Simulation | 1.695s | 2.128s | 2.203s | 2.388s | 🥈 |
| Quicksort | 0.214s | 0.245s | 0.213s | 0.261s | 🥉 |
| SQLite | 0.352s | 0.404s | 0.408s (Go or Node) | | 🥈 |
| Sieve of Eratosthenes | 0.015s | 0.028s | 0.020s | 0.040s | 🥉 |
| String Manipulation | 0.008s | 0.045s | 0.016s | 0.036s | #4 |

CLI Tool Benchmarks

| Benchmark | ChadScript | grep | node | xxd | Place |
|---|---|---|---|---|---|
| Hex Dump | 0.431s | | 0.912s | 0.130s | 🥈 |
| Recursive Grep | 0.018s | 0.009s | 0.096s | | 🥈 |

@cs01 cs01 merged commit dcf1cd3 into main Apr 13, 2026
13 checks passed
cs01 added a commit that referenced this pull request Apr 13, 2026
Co-authored-by: cs01 <cs01@users.noreply.github.com>
cs01 added a commit that referenced this pull request Apr 14, 2026
cs01 added a commit that referenced this pull request Apr 14, 2026
…506)" (#511)

This reverts commit dcf1cd3.

Co-authored-by: cs01 <cs01@users.noreply.github.com>
@cs01 cs01 deleted the perf/bounds-check-elim branch April 24, 2026 20:53