Pull requests: openai/parameter-golf
[Record] Block Attention Residuals + Tuned Legal TTT — val_bpb 1.12242 (8xH100 primary)
#1696
opened Apr 17, 2026 by kings-crown
[Record] Stage 3 + SpinQuant V1 + MP-SGD-TTT — val_bpb 1.0759
#1695
opened Apr 17, 2026 by X-Abhishek-X
Non-record: 11L XSA-All + EMA + Legal GPTQ on 8xH100 (1.11355 BPB)
#1694
opened Apr 17, 2026 by Rtx09x
Record: Casefold V4 + AttnOutGate + Multi-Phase Global SGD TTT — val_bpb 1.05733 (3-seed mean)
#1693
opened Apr 17, 2026 by dexhunter
Contributor
5 of 7 tasks
Add train_gpt_13 trainer and SDPA GQA compatibility guard in train_gpt
#1690
opened Apr 17, 2026 by vanivamshi
SP8192 + Adaptive Hessian-Sensitivity GPTQ Clipping — 1.0822 bpb
#1689
opened Apr 17, 2026 by chris-colinsky
Add SP8192 qkramp05 + par-residual L6 + legal TTT systems rerun (1.080885 seed 42)
#1688
opened Apr 17, 2026 by Buld1n
Record: K_KVShare_Wider full-recipe FLA — val_bpb 1.04090 (3-seed mean)
#1687
opened Apr 17, 2026 by resouer
Non-record: JEPA Hybrid — first latent-prediction LM (1.7622 BPB, 7.5MB)
#1685
opened Apr 17, 2026 by butbutt42
10-min record: 13L int4 MLP + qTTT + QAT Precompile + ANS Hybrid (val…
#1683
opened Apr 16, 2026 by yunoshev
Non-record: GradPower for Muon prefers p<1 in matched H100 ablation
#1682
opened Apr 16, 2026 by PapaFranku4647
[Non-record] Megakernel Saturation Study: 5 Triton fusion variants cannot beat torch.compile at 27M scale
#1679
opened Apr 16, 2026 by ChideraIbe123
4 tasks done
SP8192 + 4-Layer Depth Recurrence (loop_end=6)
#1678
opened Apr 16, 2026 by tashapais
5 tasks
Record-track: Trajectory-State Readout + Muon 0.98 + Legal TTT (1.0788)
#1676
opened Apr 16, 2026 by aazizyan
11L + LN Scale + BigramHash 3072x112 + GPTQ: val_bpb=1.1451
#1675
opened Apr 16, 2026 by jayzuccarelli
Non-record: Parcae Loop Injection + Gemma-style Attention + Gram NS
#1674
opened Apr 16, 2026 by mikeapedia
[Non-Record] FoBa-GLU + GramMuon + INT6 QAT — val_bpb 2.636 (unlimited compute track)
#1673
opened Apr 16, 2026 by thehimalayanleo
Record: Gated Residual Scaling (Token-wise) for Attention + MLP - 1.3827 BPB
#1671
opened Apr 16, 2026 by souro26
Record: Casefold V4 Tokenizer + Multi-Phase Global SGD TTT — val_bpb 1.05970 (3-seed mean)
#1670
opened Apr 16, 2026 by dexhunter
Contributor
RECORD: SmearGate + Attention Output Gate + Legal TTT | val_bpb=1.07139
#1667
opened Apr 16, 2026 by MarioPaerle