Phase 10: LLMs from Scratch
AI From Scratch/Lesson 11/~120 minutes

Quantization: Making Models Fit

A 70B-parameter model in FP16 needs 140 GB: two 80 GB A100s just for the weights. Quantize to FP8 (70 GB): one 80 GB GPU. INT4 (35 GB): a MacBook with enough unified memory.
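
The figures above follow from bytes per weight times parameter count. Here is a minimal sketch of that arithmetic, assuming weight-only storage in decimal gigabytes (1 GB = 1e9 bytes) and ignoring activations, the KV cache, and the small overhead of quantization scales. The names `BYTES_PER_PARAM` and `weight_memory_gb` are hypothetical, chosen for illustration.

```python
# Bytes needed to store one weight in each format.
# Assumption: INT4 packs two 4-bit weights into a single byte.
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_memory_gb(num_params: float, fmt: str) -> float:
    """Return weight-only memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[fmt] / 1e9


# A 70B-parameter model, as in the example above.
for fmt in BYTES_PER_PARAM:
    print(f"{fmt}: {weight_memory_gb(70e9, fmt):.0f} GB")
# prints:
# FP16: 140 GB
# FP8: 70 GB
# INT4: 35 GB
```

Real deployments need headroom beyond these numbers, so treat them as a lower bound rather than a sizing guide.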

Build/Python (with numpy)/No prerequisites