View hmishfaq's full-sized avatar

Python 473 38 Updated Aug 28, 2025

Assignments for CS146S: The Modern Software Dev (Stanford University Fall 2025)

Python 3,060 698 Updated Nov 10, 2025

Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)"

Python 153 20 Updated Feb 26, 2026

A Zsh theme

Shell 53,316 2,392 Updated Mar 11, 2026

🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python…

Shell 185,330 26,313 Updated Mar 10, 2026

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,714 362 Updated Mar 10, 2026

Python 1,174 119 Updated Feb 28, 2026

JAX implementation of LMC-LSVI and Adam LMCDQN.

Python 1 Updated Jun 24, 2025

[ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.

Python 492 50 Updated Nov 27, 2025

Source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning".

Python 320 33 Updated Feb 18, 2026

A scalable asynchronous reinforcement learning implementation with in-flight weight updates.

Python 380 38 Updated Mar 10, 2026

Python 144 21 Updated Sep 29, 2025

A playbook for systematically maximizing the performance of deep learning models.

29,906 2,425 Updated Jun 18, 2024

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python 638 60 Updated Jan 29, 2026

🙌 OpenHands: AI-Driven Development

Python 68,990 8,626 Updated Mar 12, 2026

Kinetics: Rethinking Test-Time Scaling Laws

Python 86 3 Updated Jul 11, 2025

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Python 177 20 Updated Sep 18, 2025

A little Python script to collect LaTeX sources for upload to the arXiv.

Python 375 27 Updated Jul 5, 2025

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

Jupyter Notebook 12,532 4,888 Updated Feb 22, 2026

Solve puzzles. Improve your PyTorch.

Jupyter Notebook 3,976 359 Updated Jul 15, 2024

Simple RL training for reasoning

Python 3,833 287 Updated Dec 23, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 12,926 1,580 Updated Feb 27, 2026

verl: Volcano Engine Reinforcement Learning for LLMs

Python 19,858 3,410 Updated Mar 12, 2026

Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"

Python 187 23 Updated May 25, 2025

Recipes for training reward models for RLHF.

Python 1,520 108 Updated Apr 24, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 72,943 14,276 Updated Mar 12, 2026

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 157,787 32,417 Updated Mar 12, 2026

Machine Learning Foundations: Linear Algebra, Calculus, Statistics & Computer Science

Jupyter Notebook 4,610 2,208 Updated Nov 20, 2024

Python best practices guidebook, written for humans.

Batchfile 29,540 5,904 Updated Jul 29, 2024