TRL (Transformer Reinforcement Learning)
Hugging Face TRL for RLHF, DPO, PPO, and reward model training
Overview
TRL (Transformer Reinforcement Learning) is Hugging Face's library for post-training language models. It supports Supervised Fine-Tuning (SFT), Reward Modeling, PPO, DPO, and other alignment methods.
Key Features
- SFT Trainer — Supervised fine-tuning with PEFT support
- Reward Modeling — Train reward models from human preference data
- PPO — Proximal Policy Optimization for RLHF
- DPO — Direct Preference Optimization (reward-model-free)
- ORPO / KTO — Alternative alignment methods
- Multi-GPU — Native distributed training support
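To make the difference between reward modeling and DPO concrete, here is a minimal pure-Python sketch of the two per-example losses in their standard textbook form. The function names (`log_sigmoid`, `reward_pair_loss`, `dpo_loss`) and the default `beta=0.1` are illustrative assumptions, not TRL's API; TRL's trainers compute batched tensor versions of these quantities internally.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def reward_pair_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise (Bradley-Terry) loss for a reward model.

    Hypothetical helper: the loss is small when the reward model
    scores the preferred response above the rejected one.
    """
    return -log_sigmoid(r_chosen - r_rejected)


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls deviation from the reference
) -> float:
    """Per-example DPO loss from summed sequence log-probabilities.

    No reward model is involved: the implicit reward is the log-ratio
    between the policy and a frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    return -log_sigmoid(beta * (chosen_logratio - rejected_logratio))


if __name__ == "__main__":
    # Policy identical to the reference: loss equals log(2).
    print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
    # Policy already prefers the chosen response more than the reference does.
    print(dpo_loss(-4.0, -8.0, -5.0, -7.0))
```

When the policy equals the reference model, the DPO loss is exactly log 2, and it falls as the policy shifts probability toward the chosen response relative to the reference. This is why DPO needs only preference pairs and a frozen reference model, while PPO-based RLHF trains a separate reward model first.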
Quick Start
```shell
# Build the container
docker build -t trl-training .

# Launch with Slurm
sbatch slurm/run_trl.sbatch
```