PyTorch DDP
Distributed Data Parallel training - the foundation for multi-GPU PyTorch
Overview
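PyTorch Distributed Data Parallel (DDP) is the standard approach for multi-GPU data-parallel training. It runs one process per GPU, keeps a full replica of the model in each process, and averages gradients across processes with an all-reduce during the backward pass, so every replica applies the same update.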
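To make the gradient synchronization concrete, here is a minimal pure-Python sketch that simulates two ranks without any PyTorch calls. It assumes a toy one-parameter model with a mean-squared-error loss and equal-sized data shards; `grad_mse` and `all_reduce_mean` are hypothetical helper names used only for illustration, not DDP APIs.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to the scalar weight w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def all_reduce_mean(values):
    """Stand-in for an all-reduce that averages one value per rank."""
    return sum(values) / len(values)


w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Two simulated ranks, each holding an equal-sized shard of the batch.
shards = [(xs[0:2], ys[0:2]), (xs[2:4], ys[2:4])]

# Each rank computes a gradient on its own shard only.
local_grads = [grad_mse(w, shard_x, shard_y) for shard_x, shard_y in shards]

# Averaging the per-rank gradients gives every rank the same value.
synced_grad = all_reduce_mean(local_grads)

# Reference: the gradient computed on the full batch in one process.
full_grad = grad_mse(w, xs, ys)
```

With equal-sized shards, `synced_grad` matches `full_grad`, which is why data-parallel training with averaged gradients behaves like single-process training on the combined batch.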
When to Use DDP
- The model fits in the memory of a single GPU
- You want to scale training across multiple GPUs or nodes
- You want a simple setup with minimal code changes
Quick Start
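import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun for each process
torch.cuda.set_device(local_rank)  # bind this process to its own GPU
dist.init_process_group("nccl")  # reads rank and world size from the environment
model = DDP(model.to(local_rank), device_ids=[local_rank])  # model: your existing nn.Module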
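For context, the following is a hedged sketch of how the Quick Start lines might fit into one complete training step. Several things here are assumptions rather than part of the original: the function name `train_one_step` is hypothetical, the model is a toy `nn.Linear`, the data is random, and the backend defaults to `gloo` so the sketch can also run in a single CPU process. The environment-variable defaults only matter when the code is not started by a launcher such as `torchrun`.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def train_one_step(backend: str = "gloo") -> float:
    """Run one forward/backward/optimizer step under DDP and return the loss."""
    # A launcher normally sets these; the defaults allow a single-process run.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))

    dist.init_process_group(backend, rank=rank, world_size=world_size)
    try:
        if backend == "nccl":
            # GPU path: one process per GPU, as in the Quick Start.
            torch.cuda.set_device(local_rank)
            device = torch.device("cuda", local_rank)
            model = DDP(nn.Linear(4, 1).to(device), device_ids=[local_rank])
        else:
            # CPU path: device_ids is left unset for CPU modules.
            device = torch.device("cpu")
            model = DDP(nn.Linear(4, 1))

        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

        # Random toy batch; real training would load a per-rank data shard.
        inputs = torch.randn(8, 4, device=device)
        targets = torch.randn(8, 1, device=device)

        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(inputs), targets)
        loss.backward()  # DDP averages gradients across processes here
        optimizer.step()
        return loss.item()
    finally:
        dist.destroy_process_group()
```

To try it on several GPUs, one option is to call `train_one_step("nccl")` at the bottom of a script (say, a hypothetical `train.py`) and start it with `torchrun --nproc_per_node=<num_gpus> train.py`, which sets `RANK`, `WORLD_SIZE`, and `LOCAL_RANK` for each process.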