PyTorch

22 test cases

Native PyTorch distributed training examples covering DDP, FSDP, TorchTitan, DeepSpeed, and more. Includes LLM pre-training, fine-tuning, RLHF, inference serving, robotics, and multimodal models.

PyTorch FSDP

Fully Sharded Data Parallel training for large language models

FSDP, Sharding, Large Models, Multi-GPU

PyTorch DDP

Distributed Data Parallel training: the foundation for multi-GPU PyTorch

DDP, Data Parallel, Multi-GPU, Baseline
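
As a rough illustration of the idea behind data-parallel training, each worker computes gradients on its own shard of the batch, and the gradients are averaged so that every replica applies the same update. The sketch below simulates this in plain Python for a one-parameter model. It does not use torch, and the names `local_gradient`, `all_reduce_mean`, and `data_parallel_step` are hypothetical helpers for this example, not PyTorch APIs.

```python
# Pure-Python simulation of data-parallel gradient averaging.
# Not the PyTorch DDP API: every name here is a hypothetical helper.

def local_gradient(w, shard):
    """Gradient of the mean squared error for y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for an all-reduce that averages one value per worker."""
    return sum(values) / len(values)

def data_parallel_step(w, shards, lr):
    """One SGD step: per-worker gradients, averaged, then a shared update."""
    grads = [local_gradient(w, shard) for shard in shards]
    return w - lr * all_reduce_mean(grads)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
shards = [data[:2], data[2:]]  # two simulated workers with equal shard sizes

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
```

With equal-sized shards, the averaged gradient equals the gradient over the full batch, which is why the replicas stay in sync with a single-process run on the same data.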

DeepSpeed

Microsoft DeepSpeed ZeRO optimizer for memory-efficient distributed training

DeepSpeed, ZeRO, Memory Efficient, Large Models
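
To give a sense of where the memory savings come from, the sketch below estimates per-GPU model-state memory under the accounting used in the ZeRO paper for mixed-precision Adam: 2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for fp32 optimizer states. The function `zero_bytes_per_gpu` is a hypothetical helper for this estimate, not a DeepSpeed API, and it ignores activations, buffers, and fragmentation.

```python
# Back-of-the-envelope model-state memory per GPU under ZeRO stages 0-3.
# Assumes mixed-precision Adam: 2 + 2 + 12 bytes per parameter.
# zero_bytes_per_gpu is a hypothetical helper, not part of DeepSpeed.

def zero_bytes_per_gpu(num_params, num_gpus, stage):
    weights = 2 * num_params   # fp16 parameters
    grads = 2 * num_params     # fp16 gradients
    optim = 12 * num_params    # fp32 master weights, momentum, variance
    if stage >= 1:
        optim /= num_gpus      # stage 1 shards optimizer states
    if stage >= 2:
        grads /= num_gpus      # stage 2 also shards gradients
    if stage >= 3:
        weights /= num_gpus    # stage 3 also shards parameters
    return weights + grads + optim

# Example: a 7.5B-parameter model on 64 GPUs, reported in GB (1e9 bytes).
estimates_gb = {
    stage: zero_bytes_per_gpu(7.5e9, 64, stage) / 1e9 for stage in range(4)
}
```

Under these assumptions the estimate drops from 120 GB per GPU with no sharding to roughly 1.9 GB at stage 3.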

TorchTitan

PyTorch native distributed training framework for production LLM pre-training

TorchTitan, Pre-training, 4D Parallelism, Production

Picotron

Lightweight distributed training library for educational and research use

Picotron, Lightweight, Educational, Research

vLLM

High-throughput LLM inference and serving engine

vLLM, Inference, Serving, PagedAttention

OpenRLHF

Open-source RLHF framework for reward model training and policy optimization

RLHF, PPO, DPO, Alignment

NVIDIA Dynamo

Distributed LLM inference with KV cache-aware routing and disaggregated prefill/decode on HyperPod EKS

Dynamo, Inference, KV Cache, Disaggregated, SGLang

MosaicML Composer

Training efficiency library with algorithmic speedups and multi-GPU orchestration

MosaicML, Composer, Training Efficiency, Speedups

NVIDIA Isaac Lab

Sim-to-real robot learning with NVIDIA Isaac Lab on GPU clusters

Isaac Lab, Robotics, Sim2Real, Physical AI, Reinforcement Learning

OpenVLA OFT

Open Vision-Language-Action models with fine-tuning for robotic manipulation

OpenVLA, VLA, Robotics, Fine-tuning, Physical AI

nanoVLM

Lightweight vision-language model training for embodied AI

nanoVLM, VLM, Multimodal, Physical AI, Vision-Language

V-JEPA 2

Video Joint Embedding Predictive Architecture for physical world understanding

V-JEPA 2, Video, Self-supervised, Physical AI, World Models

Cosmos 3

NVIDIA Cosmos 3 Physical AI flywheel: omnimodal world models for the generate → post-train → eval loop

Cosmos, World Models, Physical AI, Omnimodal, Video Generation

DreamZero

14B World-Action Model (WAM) for robotic manipulation via video diffusion on EKS

DreamZero, World Models, Physical AI, Robotics, Video Diffusion

V-JEPA 2.1

Updated Video Joint Embedding Predictive Architecture for physical world understanding

V-JEPA 2, Video, Self-supervised, Physical AI, World Models

PointWorld

Distributed 3D world model pre-training for robotic manipulation (NVIDIA + Stanford)

PointWorld, 3D World Models, Physical AI, Robotics, Point Flow

OpenVLA

Open Vision-Language-Action model for generalist robotic manipulation

OpenVLA, VLA, Robotics, Physical AI, Vision-Language-Action

TRL (Transformer Reinforcement Learning)

Hugging Face TRL for RLHF, DPO, PPO, and reward model training

TRL, RLHF, DPO, PPO, Alignment, Reinforcement Learning

vERL

Scalable reinforcement learning framework for LLM alignment and post-training

vERL, RLHF, PPO, Scalable RL, Alignment, Reinforcement Learning

SLIME

Lightweight distributed training library for efficient LLM fine-tuning

SLIME, Lightweight, Fine-tuning, Efficient, Reinforcement Learning

Model Distillation

Knowledge distillation for compressing large models into smaller, efficient ones

Distillation, Knowledge Transfer, Compression, Model Customisation
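
For readers new to knowledge distillation, a common formulation trains the student to match the teacher's temperature-softened output distribution. The sketch below computes that soft-target loss in plain Python for a single example. It is a generic illustration, not the code used by this test case; `softmax` and `distillation_loss` are hypothetical helpers, and the temperature value is an arbitrary choice.

```python
import math

# Soft-target distillation loss for one example, in plain Python.
# Generic illustration only: helper names and the temperature are assumptions.

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]
far_student = [0.5, 1.0, 4.0]
loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge. In practice it is usually combined with the ordinary cross-entropy loss on ground-truth labels.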