I'm an ML Engineer building production-grade systems where memory, latency, and model independence are non-negotiable. Currently shipping real-time sports analytics at Owl AI (X Games, snowboard) and a brain-inspired memory framework called NeuroStack that runs in my system tray. I'm finishing my MS at CU Boulder (GPA 3.93/4.00), TA-ing Distributed Systems, and graduating in May 2026. Open to full-time ML / SE / DS roles.
- Real-time inference throughput · 45 → 58 FPS · via TensorRT, quantization, prefetch
- Top-1 accuracy on AWA2 · 59% vs 36% · CLIP × KG-RGCN beats CLIP baseline
- Deployment time · 15 → 7 min · dockerized + CI/CD on Azure
- A/B-test decision latency · group-sequential sampling at Fittlyf
- SOTA models beaten · Frontiers paper on nuclear fuel ML
A framework, two applications built on it, and a fourth bet on motion. NeuroStack is the foundation; everything else is what gets built on top.
A brain-inspired memory framework for AI agents. Built from scratch — no LangChain, no AutoGen, no ORM.
- 8 memory layers across 4 temperature zones — Hot (identity / temporal context / experience), Session (episodic buffer), Cool (episodic store + Kuzu entity graph + procedural memory), Archive (soft-delete)
- Multi-signal retrieval — 0.4 × cosine + 0.3 × ACT-R activation + 0.2 × salience + 0.1 × recency. Retrieved memories rank higher next time, automatically.
- CompactionAgent runs 10 jobs every 5 min — embed, Zettelkasten link, decay, consolidate to semantic graph, reflect, re-promote, trim, refresh temporal cache
- Salience-gated ingestion — every episode scored at write time: 0.5 × surprise + 0.5 × LLM importance
- Local-first ModelRouter — qwen3.5:4b on Ollama by default, OpenAI as fallback chain (gpt-5.2 → 5.1 → 5 → 4.1 → 4o)
- Stack · Python · SQLite (28 tables) · Kuzu graph DB · Ollama · OpenAI · embeddings (text-embedding-3-small)
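The two weighted sums in the list above can be sketched in a few lines. This is a minimal illustration, not NeuroStack's actual code: the weights come from the bullets, while the names `Signals`, `retrieval_score`, `ingestion_salience`, and `rank` are hypothetical, and every signal is assumed to be pre-normalized to [0, 1].

```python
from dataclasses import dataclass

# Weights copied from the retrieval bullet; the structure around them is assumed.
RETRIEVAL_WEIGHTS = {"cosine": 0.4, "activation": 0.3, "salience": 0.2, "recency": 0.1}


@dataclass
class Signals:
    """Per-memory signals, each assumed to be normalized to [0, 1]."""
    cosine: float      # embedding similarity to the query
    activation: float  # ACT-R style activation (grows with past retrievals)
    salience: float    # score assigned at write time
    recency: float     # newer memories closer to 1.0


def retrieval_score(s: Signals) -> float:
    """Weighted sum of the four retrieval signals."""
    return (
        RETRIEVAL_WEIGHTS["cosine"] * s.cosine
        + RETRIEVAL_WEIGHTS["activation"] * s.activation
        + RETRIEVAL_WEIGHTS["salience"] * s.salience
        + RETRIEVAL_WEIGHTS["recency"] * s.recency
    )


def ingestion_salience(surprise: float, llm_importance: float) -> float:
    """Write-time salience: equal blend of surprise and LLM-judged importance."""
    return 0.5 * surprise + 0.5 * llm_importance


def rank(memories: dict[str, Signals]) -> list[str]:
    """Return memory ids ordered from highest to lowest retrieval score."""
    return sorted(memories, key=lambda k: retrieval_score(memories[k]), reverse=True)
```

Because activation feeds back into the score, a memory that is retrieved often can outrank one that is only slightly closer in embedding space, which is the "rank higher next time" behavior described above.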
First app built on NeuroStack. Runs silently in the macOS / Windows system tray, researches papers continuously, and briefs me every morning.
- Autonomous arXiv + Semantic Scholar + web search every 2h
- Trend reports every 4h · daily morning briefings · personal & career assistant chat
- Sandboxed code agent that writes and executes Python with procedural memory
- 7-daemon orchestrator — research / trends / briefing / compaction / model-check / local-warmup
- Stack · tkinter + pystray · arXiv + Semantic Scholar + Brave Search APIs · ~696 KB Python
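The cadences above (research every 2h, trends every 4h, a daily briefing) suggest a simple interval scheduler. The sketch below is one possible shape for that, not the app's actual orchestrator: `schedule` is a hypothetical helper that simulates which job comes due when, using a min-heap keyed on the next due time in seconds.

```python
import heapq


def schedule(intervals: dict[str, int], horizon: int) -> list[tuple[int, str]]:
    """Simulate periodic jobs and return (due_time, job_name) pairs up to `horizon`.

    `intervals` maps a job name to its period in seconds. Ties at the same
    due time are broken alphabetically by job name.
    """
    if any(period <= 0 for period in intervals.values()):
        raise ValueError("every interval must be a positive number of seconds")

    # Each job first comes due one full period after start.
    heap = [(period, name) for name, period in intervals.items()]
    heapq.heapify(heap)

    timeline: list[tuple[int, str]] = []
    while heap and heap[0][0] <= horizon:
        due, name = heapq.heappop(heap)
        timeline.append((due, name))
        # Re-arm the job for its next run.
        heapq.heappush(heap, (due + intervals[name], name))
    return timeline


HOUR = 3600
# Periods taken from the bullets above; job names are illustrative.
JOBS = {"research": 2 * HOUR, "trends": 4 * HOUR, "briefing": 24 * HOUR}
first_four_hours = schedule(JOBS, horizon=4 * HOUR)
```

A real daemon would sleep until the next due time and run the job in a worker thread or process; the simulation only shows the ordering logic.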
The next NeuroStack app — fully local agent runtime. No OpenAI fallback, no Anthropic, no provider lock-in.
- Privacy-by-default · cost-bounded · resilient to provider rate-limits or model sunsets
- One config flag swaps Ollama → llama.cpp → MLX → Claude → GPT
- Stack · NeuroStack · Ollama / llama.cpp / MLX · MCP-compatible tool registry
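One way to get "one config flag swaps the backend" plus resilience to rate limits is an ordered chain resolved from config. The sketch below assumes each backend is just a callable from prompt to completion; `Backend`, `build_chain`, `generate`, and the `"backends"` config key are hypothetical names, and no real provider SDK is called.

```python
from typing import Callable, Sequence

# A backend is modeled as a plain function: prompt in, completion out.
Backend = Callable[[str], str]


class AllBackendsFailed(RuntimeError):
    """Raised when every backend in the configured chain errors out."""


def build_chain(config: dict, registry: dict[str, Backend]) -> list[tuple[str, Backend]]:
    """Resolve the ordered backend names in `config["backends"]` against a registry."""
    names = config["backends"]
    unknown = [n for n in names if n not in registry]
    if unknown:
        raise KeyError(f"unknown backends: {unknown}")
    return [(n, registry[n]) for n in names]


def generate(prompt: str, chain: Sequence[tuple[str, Backend]]) -> tuple[str, str]:
    """Try each backend in order; return (backend_name, completion) from the first success."""
    errors: list[str] = []
    for name, backend in chain:
        try:
            return name, backend(prompt)
        except Exception as exc:  # e.g. rate limit, missing model, network error
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))
```

Swapping runtimes then means editing the `backends` list, for example `["ollama"]` versus `["mlx", "llama.cpp"]`, without touching calling code.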
Task-specific movement understanding beyond pose landmarks. The platform doesn't ship analyzers — it generates them.
- Intent → agent-network generates analyzer → sandbox validates → deterministic runtime
- Action signatures over time, not pose snapshots — feedback / scoring / comparison
- Three surfaces: coaching · rehab · analytics
- Stack · MediaPipe · PyTorch · agent network · WebRTC
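To make "action signatures over time, not pose snapshots" concrete, here is a small sketch that turns per-frame landmarks into a joint-angle time series. It is an illustration under assumptions, not the platform's analyzer: landmarks are taken to be 2D `(x, y)` points, and `joint_angle`, `angle_signature`, and `range_of_motion` are hypothetical helpers.

```python
import math

Point = tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle at vertex b (in degrees, 0 to 180) formed by segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of |cross| and dot is numerically stable near 0 and 180 degrees.
    return math.degrees(math.atan2(abs(cross), dot))


def angle_signature(frames: list[tuple[Point, Point, Point]]) -> list[float]:
    """One angle per frame, e.g. hip-knee-ankle triples tracked through a squat."""
    return [joint_angle(a, b, c) for a, b, c in frames]


def range_of_motion(signature: list[float]) -> float:
    """Spread between the largest and smallest angle seen across the clip."""
    return max(signature) - min(signature)
```

A single frame only says where the joints are; the signature shows how the angle evolves, which is what scoring, feedback, and comparison would operate on.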
→ Full architecture write-up on the /building page
Real-time sports analytics for X Games snowboard. ML pipeline that ingests live broadcast feeds, extracts pose, runs temporal models, and emits commentator-facing trick metrics in real time.
- Productionized a live snowboard video ML pipeline on MediaMTX (SRT) for reliable broadcast ingest
- 45 → 58 FPS GPU inference via TensorRT, INT8 quantization, caching/prefetching
- Trained keypoint + temporal models on large-scale X Games video; tracked with MLflow
- Bayesian / MCMC scoring of trick metrics → event-score and podium-probability priors
- Distributed GPU data-curation pipeline orchestrating multimodal models (Qwen-3VL, NVIDIA Cosmos, SAM) for autonomous event detection, scene segmentation, dataset curation
- LLM agents for labeling / dataset QA / bookkeeping
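For a sense of scale on the 45 → 58 FPS figure above, the arithmetic works out to roughly 29% more frames per second, or about 5 ms less per frame. The helper names below are illustrative; only the two FPS values come from the list.

```python
def frame_time_ms(fps: float) -> float:
    """Average time budget per frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def throughput_gain(before_fps: float, after_fps: float) -> float:
    """Relative throughput increase, e.g. 0.25 means 25% more frames per second."""
    return after_fps / before_fps - 1.0


before, after = 45.0, 58.0  # FPS figures from the bullet above
saved_ms = frame_time_ms(before) - frame_time_ms(after)
gain = throughput_gain(before, after)
print(f"{frame_time_ms(before):.1f} ms -> {frame_time_ms(after):.1f} ms per frame "
      f"({saved_ms:.1f} ms saved, {gain:.1%} more throughput)")
```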
location: Boulder, Colorado
education: MS Computer Science · CU Boulder · 2024–2026
gpa: 3.93 / 4.00
working: ML Engineer Intern @ Owl AI
teaching: TA Distributed Systems · Prof. Mark Zhao
shipping: real-time snowboard ML pipeline (X Games)
studying: CSCI 5214 Big Data Architecture
reading: Systems for ML — 22 paper reviews on the portfolio
open-to: Full-time ML / SE / Data Science roles · May 2026
- Bridging Multimodal Microscopy for Advanced Characterization on Nuclear Fuel Using Machine Learning — Frontiers in Mechanical Engineering · Digital Manufacturing · Vol. 11, 2025 · Co-authored at Idaho National Lab — transfer-learning framework outperformed 4 SOTA models on cross-scale defect segmentation
- Format Matters: An Empirical Study of Data Storage Format Impact on ML Training Pipelines — Systems for ML coursework · CU Boulder · Benchmarked 6 formats on CIFAR-10 + 1M-row tabular ML; isolated effects to I/O
- Multimodal Zero-Shot Learning for Unseen Concepts — CLIP × Knowledge-Graph R-GCN · +23 points Top-1 over CLIP baseline on AWA2 (59% vs. 36%), +13% F1
- Human Pose Estimation in Fitness Tracking & Guidance — IJFMR · BlazePose-based real-time exercise monitoring + automated reporting
- 3rd Place · T9-MediHack 2025 (24 hr hackathon) — RePosture AI
- 7th of 20 teams · AWS Jam Hackathon, CU Boulder — 9/13 security challenges solved
- Best Project of the Year (Computer Science) · VTU Belagavi 2022 — KSCST-supported
- Frontiers paper · Co-author, nuclear fuel characterization, 2025
I'm looking for full-time ML / SE / Data Science roles starting May 2026. US-authorized to work (OPT eligible). Strongest fit: production ML systems with hard latency, throughput, or cost constraints — sports / video / agents / RAG / inference optimization.
→ AIML resume · SDE resume · Transcript
Gym · football · cricket · biking · the occasional film. Repetition over inspiration.


