Phase 19: Capstone Projects
AI From Scratch/Lesson 77/~90 min

Data Parallel DDP From Scratch

DistributedDataParallel is a hook on top of allreduce. Wrap a model, broadcast the initial parameters from rank 0 so every rank starts identical, install a backward hook on every parameter that issues an allreduce of the gradient, and the...
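The three steps named above (broadcast from rank 0, allreduce the gradients, then update) can be sketched without a GPU or a process group. What follows is a minimal single-process simulation, not the lesson's implementation: each "rank" is a plain Python list of parameters, the collectives are ordinary functions, and the allreduce runs once per step instead of from a per-parameter backward hook. All names (`broadcast`, `allreduce_mean`, `local_grad`, `ddp_step`) and the toy linear model are assumptions made for illustration.

```python
# Single-process simulation of data-parallel training (illustrative only).
# No torch and no real communication: each "rank" is a list [w, b].
# All function names here are hypothetical, not a library API.

def broadcast(replicas, src=0):
    """Copy rank `src`'s parameter values into every replica."""
    source = list(replicas[src])
    for params in replicas:
        params[:] = source


def allreduce_mean(grads_per_rank):
    """Average gradients elementwise; every rank receives the same result."""
    world_size = len(grads_per_rank)
    mean = [sum(column) / world_size for column in zip(*grads_per_rank)]
    return [list(mean) for _ in grads_per_rank]


def local_grad(params, shard):
    """Gradient of mean squared error for y = w*x + b on one rank's shard."""
    w, b = params
    n = len(shard)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in shard) / n
    return [grad_w, grad_b]


def ddp_step(replicas, shards, lr):
    """One training step: local backward, allreduce, identical local update."""
    grads = [local_grad(p, s) for p, s in zip(replicas, shards)]
    # In a real DDP wrapper this would be triggered by backward hooks.
    grads = allreduce_mean(grads)
    for params, grad in zip(replicas, grads):
        for i in range(len(params)):
            params[i] -= lr * grad[i]


# Two ranks that start with different parameters.
replicas = [[0.5, -1.0], [3.0, 2.0]]
broadcast(replicas, src=0)  # every rank now matches rank 0

# Each rank sees a different shard of data drawn from y = 2x + 1.
shards = [
    [(1.0, 3.0), (2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]

for _ in range(1000):
    ddp_step(replicas, shards, lr=0.05)
```

Because every rank starts from the same values and applies the same averaged gradient, the replicas stay identical after every step even though each one only ever sees its own shard. With equal-sized shards, the averaged gradient equals the gradient over the full dataset.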

Build/Python/No prerequisites