perf: skip temporal accumulation when disabled#2
Open
tonyblu331 wants to merge 1 commit into
Open
Conversation
deaa473 to
9df28d7
Compare
- Gate accumulation target clearing: only clear when accumulate=true - Skip accumulation render pass when accumulate=false (default) - Point composite directly to blur output when accumulate disabled - Fix floating point precision tests with toBeCloseTo() Saves 1 render pass + 2 clears per frame (~2M-8M pixel ops)
9df28d7 to
2c90f8b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Continuing with the refinements of the N8AO shader, this is quite a solid base to build off, and a quick optimization that can bring a lot. This is a follow-up to the ongoing work on the N8AO WebGPU/TSL implementation.
Summary
Optimizes the pipeline by skipping unnecessary GPU work when temporal accumulation is disabled (the default).
When the user has not enabled temporal accumulation, the code was still clearing two accumulation targets to black each frame and running a full-screen accumulation pass that did nothing more than copying the blur output back to itself via an identity mix(). This removes that waste.
Why this matters
The default configuration has accumulate: false. Most users bump into this path with every frame. The accumulation pass only earns its keep when the camera is static and noise is being blended across successive frames. When it is off, the pass is dead work waiting to be cut.
Changes
src/N8AONode.ts -- Two gates:
src/math.test.ts -- Floating-point comparisons now use toBeCloseTo via a compact expectCloseTo helper. The previous exact-equality assertions were brittle across platforms and toolchain versions.
Benchmark
The following numbers come from a mock-renderer pipeline trace (benchmark.mjs) that records render pass and clear counts per frame. Each configuration was run for 1000 iterations.
The accumulation-enabled path is unchanged. The savings appear in every other configuration, meaning the path that runs for the majority of users.
Rough GPU-operation equivalents at 1080p (half-res AO = 960x540):
About 3.1 million GPU operations per frame at 1080p. At 4K the figure is roughly 10.4 million operations per frame.
Safety
Test results
All 4 tests pass across both the N8AO math utilities and the pipeline logic.