🎯
Reinforcement Learning
4 test cases
RLHF, DPO, PPO, and scalable RL frameworks for LLM alignment and post-training. Train reward models, optimize policies, and align models with human preferences at scale.
🤖
NVIDIA Isaac Lab
Sim-to-real robot learning with NVIDIA Isaac Lab on GPU clusters
Isaac Lab, Robotics, Sim2Real, Physical AI
🎯
TRL (Transformer Reinforcement Learning)
Hugging Face TRL for RLHF, DPO, PPO, and reward model training
TRL, RLHF, DPO, PPO, Alignment
🎯
vERL
Scalable reinforcement learning framework for LLM alignment and post-training
vERL, RLHF, PPO, Scalable RL, Alignment
🎯
SLIME
Lightweight distributed training library for efficient LLM fine-tuning
SLIME, Lightweight, Fine-tuning, Efficient