- Austin, TX, USA
- (UTC -05:00)
- @TheQonfused
- https://orcid.org/0009-0008-8122-3375
Stars
Command-line toolkit for interacting with the Naya Create split keyboard over USB CDC.
CLI text optimizer built on GEPA. Uses agentic coding CLIs as mutator and observer; no API keys required.
FlexTensor is a tensor offloading and management library for PyTorch that enables running large models on limited GPU memory by intelligently offloading tensors between GPU and CPU memory.
TriAttention — Efficient long reasoning with trigonometric KV cache compression. Enables OpenClaw local deployment on memory-constrained GPUs.
🎨 NeMo Data Designer: Generate high-quality synthetic data from scratch or from seed data.
TurboQuant-based compression engine for LLM KV cache
MathCode: A Frontier Mathematical Coding Agent
Python package for LLM compression
Code for the papers: “Four Over Six: More Accurate NVFP4 Quantization with Adaptive Block Scaling” and “Adaptive Block-Scaled Data Types”
Inspects nsys dumps and measures NCCL collective launch skew
Region-level profiling for CUDA kernels with trace, NVBit, CUPTI, NSys, and an interactive Explorer.
Rust implementation of protobuf with editions support, JSON serialization, and zero-copy views
APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Outstanding Paper Honorable Mention
Compile programs directly into transformer weights. Includes a 2D convex-hull KV cache with O(log n) inference.
Repository for the blog post JAX-LM: Language Modeling and Distributed Training in JAX
🔥 LeetCode for PyTorch — practice implementing softmax, attention, GPT-2 and more from scratch with instant auto-grading. Jupyter-based, self-hosted or try online.
A framework for verifiable reasoning with language models.
Running a big model on a small laptop
REAP: Router-weighted Expert Activation Pruning for SMoE compression
Your personal intelligence agent. Watches the world from multiple data sources and pings you when something changes.