DeepSpeed

Microsoft DeepSpeed ZeRO optimizer for memory-efficient distributed training


Overview

DeepSpeed is a deep learning optimization library for efficient distributed training. Its ZeRO (Zero Redundancy Optimizer) cuts per-GPU memory use by partitioning training state across data-parallel workers instead of replicating it on every GPU.

ZeRO Stages

  • ZeRO-1: Partitions optimizer states across data-parallel workers
  • ZeRO-2: ZeRO-1 plus gradient partitioning
  • ZeRO-3: ZeRO-2 plus parameter partitioning (similar to PyTorch FSDP)
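
To get a feel for what each stage saves, the sketch below estimates per-GPU memory for model states only. It assumes mixed-precision Adam accounting as described in the ZeRO paper: 2 bytes per parameter for fp16 weights, 2 bytes for fp16 gradients, and 12 bytes for fp32 optimizer states (master weights, momentum, variance). The function name and its arguments are hypothetical and not part of the DeepSpeed API; activations, buffers, and fragmentation are ignored, so treat the numbers as a rough lower bound.

```python
def zero_model_state_bytes(num_params, num_gpus, stage, optimizer_bytes_per_param=12):
    """Approximate per-GPU bytes for parameters, gradients, and optimizer states.

    Hypothetical helper for illustration only; assumes fp16 weights and
    gradients (2 bytes each) and fp32 Adam states (12 bytes per parameter).
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    param_bytes = 2 * num_params
    grad_bytes = 2 * num_params
    optim_bytes = optimizer_bytes_per_param * num_params

    # Each stage partitions one more kind of state across the GPUs.
    if stage >= 1:
        optim_bytes /= num_gpus
    if stage >= 2:
        grad_bytes /= num_gpus
    if stage >= 3:
        param_bytes /= num_gpus

    return param_bytes + grad_bytes + optim_bytes


if __name__ == "__main__":
    # Example: a 1B-parameter model on 8 GPUs (illustrative sizes).
    for stage in range(4):
        gb = zero_model_state_bytes(1_000_000_000, 8, stage) / 1e9
        print(f"stage {stage}: {gb:.2f} GB per GPU")
```

Stage 0 here means plain data parallelism with everything replicated, which serves as the baseline for comparison.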

Quick Start

deepspeed --num_gpus 8 train.py --deepspeed_config ds_config.json
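
The command above expects a ds_config.json file. A minimal sketch of generating one from Python follows. The keys used (`zero_optimization.stage`, `bf16.enabled`, `train_micro_batch_size_per_gpu`, `gradient_accumulation_steps`) are standard DeepSpeed config fields, but the values are illustrative assumptions rather than recommendations, and `build_ds_config` is a hypothetical helper name.

```python
import json


def build_ds_config(zero_stage=2, micro_batch_size=4, grad_accum_steps=1):
    """Return a minimal DeepSpeed config dict (values are illustrative assumptions)."""
    if zero_stage not in (0, 1, 2, 3):
        raise ValueError("zero_stage must be 0, 1, 2, or 3")
    return {
        # Per-GPU batch size for a single forward/backward pass.
        "train_micro_batch_size_per_gpu": micro_batch_size,
        # Number of micro-batches accumulated before each optimizer step.
        "gradient_accumulation_steps": grad_accum_steps,
        # Selects which ZeRO stage to apply.
        "zero_optimization": {"stage": zero_stage},
        # Assumes hardware with bfloat16 support; switch to "fp16" otherwise.
        "bf16": {"enabled": True},
    }


if __name__ == "__main__":
    with open("ds_config.json", "w") as f:
        json.dump(build_ds_config(), f, indent=2)
```

This sketch does not define an optimizer, so it assumes train.py creates one and passes it to DeepSpeed; alternatively an "optimizer" section can be added to the config.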