Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Updated Oct 11, 2025 - Python
Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
The ParroT framework to enhance and regulate translation abilities during chat, based on open-source LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) and human-written translation and evaluation data.
Product analytics for AI Assistants
[ECCV 2024] Towards Reliable Advertising Image Generation Using Human Feedback
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
[ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"
[NeurIPS 2023] Official Codebase for "Aligning Synthetic Medical Images with Clinical Knowledge using Human Feedback"
Documentation at
Reinforcement Learning from Human Feedback with 🤗 TRL
Break out of the AI training bubble
REactive Behavior Constraint-Aware Tree learning (REBCAT) - a human-robot collaboration framework to learn tasks from demonstrations. Interpretable, fast, object-centric, and reactive.
RLHF Loop System - Learning project with monitoring dashboard, drift detection, and AI feedback analysis built with Claude's assistance
Daily Mandarin-English semantic alignment corpus for RLHF training, tone repair, AI metaphor translation, and OpenAI contributor tracking. #SamPickMe #RLHF #TSMC
🤖 Enhance reinforcement learning stability and efficiency with advanced algorithms like TRPO, PPO, DPO, GRPO, DAPO, and GSPO for optimized policy training.