number-theoretic-transform

Here are 17 public repositories matching this topic...

GPU-accelerated Number-Theoretic Transform (NTT) for ZK-proof generation. Targets the NTT bottleneck (91% of Groth16 prover time) with two CUDA optimizations: an async double-buffered pipeline that eliminates CPU-GPU transfer overhead, and IADD3-path Montgomery multiplication that reduces finite-field instruction latency. BLS12-381, Ampere sm_86, Nsight-profiled.

  • Updated Mar 16, 2026
  • Cuda
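
For readers unfamiliar with the Montgomery multiplication mentioned in the description above, the following is a minimal single-word sketch in Python. It is not taken from the repository. The modulus `P = 998244353` (a common NTT-friendly prime), the 32-bit radix, and the names `mont_reduce`, `mont_mul`, `to_mont`, and `from_mont` are all assumptions chosen for illustration. A BLS12-381 implementation would instead work on multi-limb field elements and, on the GPU, map the additions onto hardware instructions such as IADD3.

```python
# Illustrative single-word Montgomery arithmetic (assumed parameters, not BLS12-381).
P = 998244353                 # assumed NTT-friendly prime: 119 * 2**23 + 1
R_BITS = 32                   # assumed radix R = 2**32, with R > P and gcd(R, P) = 1
R = 1 << R_BITS
MASK = R - 1
# -P^{-1} mod R; pow() with a negative exponent needs Python 3.8 or later.
P_NEG_INV = (-pow(P, -1, R)) % R


def mont_reduce(t: int) -> int:
    """Return t * R^{-1} mod P for 0 <= t < P * R, without dividing by P."""
    m = ((t & MASK) * P_NEG_INV) & MASK   # m = -t * P^{-1} mod R
    u = (t + m * P) >> R_BITS             # t + m*P is divisible by R, so the shift is exact
    return u - P if u >= P else u         # u < 2P, so one conditional subtraction suffices


def to_mont(a: int) -> int:
    """Map a into Montgomery form: a * R mod P."""
    return (a * R) % P


def from_mont(a_mont: int) -> int:
    """Map a value out of Montgomery form."""
    return mont_reduce(a_mont)


def mont_mul(a_mont: int, b_mont: int) -> int:
    """Multiply two values that are already in Montgomery form."""
    return mont_reduce(a_mont * b_mont)


if __name__ == "__main__":
    x, y = 123456789, 987654321
    product = from_mont(mont_mul(to_mont(x), to_mont(y)))
    print(product == (x * y) % P)
```

The point of the technique is that `mont_reduce` replaces a division by `P` with multiplications, a mask, and a shift, which is why it suits the butterfly operations inside an NTT, where every twiddle-factor multiplication is a modular multiplication.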
