NeMo RL
Reinforcement learning from human feedback with NeMo
Overview
NeMo RL extends NeMo with reinforcement learning capabilities for alignment training, including RLHF, PPO, and reward model training.