OpenRLHF
Open-source RLHF framework for reward model training and policy optimization
Overview
OpenRLHF provides distributed RLHF training capabilities, including reward model training, PPO, and DPO alignment.
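As a rough illustration of two of the objectives named above, the sketch below computes the standard pairwise (Bradley-Terry) reward-model loss and the DPO loss for a single preference pair in plain Python. It is not OpenRLHF's API: the function names, the scalar inputs, and the default `beta` are all hypothetical, chosen only to show the shape of the math.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def reward_model_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Pairwise loss: low when the chosen response scores above the rejected one.

    Hypothetical helper for illustration, not an OpenRLHF function.
    """
    return -log_sigmoid(chosen_reward - rejected_reward)


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls how far the policy may drift
) -> float:
    """DPO loss for one preference pair, given summed sequence log-probabilities.

    Hypothetical helper for illustration, not an OpenRLHF function.
    """
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    return -log_sigmoid(beta * (policy_margin - ref_margin))


if __name__ == "__main__":
    # Reward model prefers the chosen response: loss is below log(2).
    print(reward_model_loss(chosen_reward=2.0, rejected_reward=0.0))
    # Policy prefers the chosen response more strongly than the reference does.
    print(dpo_loss(-10.0, -14.0, -12.0, -13.0))
```

Both losses share the same form, a negative log-sigmoid of a margin. The reward model takes the margin between two scalar rewards, while DPO takes it between the policy's and the reference model's log-probability gaps, so no separate reward model is needed. A real training run would apply these over batches of tensors rather than single floats.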