TRL (Transformer Reinforcement Learning)

Hugging Face TRL for RLHF, DPO, PPO, and reward model training

Overview

TRL (Transformer Reinforcement Learning) is Hugging Face’s library for fine-tuning and aligning language models. It supports supervised fine-tuning (SFT), reward modeling, PPO, DPO, and other alignment methods.

Key Features

  • SFT Trainer: supervised fine-tuning, with optional PEFT adapters
  • Reward Modeling: train reward models from human preference data
  • PPO: Proximal Policy Optimization for RLHF
  • DPO: Direct Preference Optimization, which trains directly on preference pairs without a separate reward model
  • ORPO / KTO: alternative alignment methods
  • Multi-GPU: native distributed training support (TRL builds on Hugging Face Accelerate)
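
To make the Reward Modeling and DPO entries above more concrete, here is a minimal sketch of the two pairwise objectives in plain Python. It does not call TRL and is not TRL's implementation. The function names, the scalar inputs, and the `beta=0.1` default are illustrative assumptions, and a real trainer would compute these quantities over batches of token log-probabilities.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    # Branch on the sign so exp() is only ever called on a non-positive value.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def reward_model_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry pairwise loss: push the chosen reward above the rejected one."""
    return -log_sigmoid(reward_chosen - reward_rejected)


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls how far the policy may drift from the reference
) -> float:
    """DPO loss for one preference pair, given sequence log-probabilities."""
    # The log-ratios against a frozen reference model act as implicit rewards,
    # which is why no separately trained reward model is needed.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    return -log_sigmoid(beta * (chosen_logratio - rejected_logratio))
```

Both losses equal log 2 (about 0.693) when the model expresses no preference between the two responses, and fall toward zero as the chosen response is increasingly favored. In this sketch the DPO loss has the same form as the reward-model loss, with beta-scaled log-ratios against the reference model standing in for the learned rewards.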

Quick Start

# Build the container
docker build -t trl-training .

# Launch with Slurm
sbatch slurm/run_trl.sbatch