OpenRLHF

Open-source RLHF framework for reward model training and policy optimization

Overview

OpenRLHF supports distributed RLHF training, including reward model training, PPO, and DPO alignment.
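
To make two of these objectives concrete, here is a minimal sketch of the standard pairwise (Bradley-Terry) reward model loss and the DPO loss as they are usually defined in the literature. It is plain Python on scalar values and is not OpenRLHF's API: the function names `log_sigmoid`, `reward_model_loss`, and `dpo_loss`, and the default `beta=0.1`, are assumptions chosen for illustration only.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def reward_model_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Pairwise loss: -log sigmoid(r_chosen - r_rejected).

    The loss shrinks as the reward model scores the preferred
    response higher than the rejected one.
    """
    return -log_sigmoid(chosen_reward - rejected_reward)


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative default, not taken from OpenRLHF
) -> float:
    """DPO loss for one preference pair, given sequence log-probabilities.

    Compares how much more the policy prefers the chosen response
    than the frozen reference model does.
    """
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    return -log_sigmoid(beta * (policy_margin - ref_margin))


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(reward_model_loss(1.5, 0.2))
    print(dpo_loss(-12.0, -15.0, -13.0, -14.0))
```

In real training these quantities would be batched tensors produced by the models, and the distributed machinery is what a framework such as OpenRLHF adds on top; the scalar form above only shows the shape of the objectives.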