AI From Scratch/Lesson 36/~90 minutes
Training Loop and Evaluation
A loop that does not measure is a loop that lies. This lesson builds the training loop that drives the GPT model: AdamW with weight decay split, a warmup plus cosine learning rate schedule, a calc_loss_batch helper, an evaluate_model pass...
Build/Python
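The warmup plus cosine learning rate schedule named in the summary can be sketched in a few lines of plain Python. This is a minimal illustration, not the lesson's own code: the function name `lr_at_step` and every default value (`max_lr`, `min_lr`, `warmup_steps`, `total_steps`) are assumptions chosen for the example.

```python
import math


def lr_at_step(step, max_lr=3e-4, min_lr=3e-5, warmup_steps=100, total_steps=1000):
    """Return the learning rate for a 0-indexed optimizer step.

    Hypothetical helper: linear warmup up to max_lr, then cosine decay
    down to min_lr. All defaults are illustrative assumptions.
    """
    if step < warmup_steps:
        # Linear ramp; (step + 1) keeps the very first step above zero.
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        # Past the planned horizon, hold at the floor.
        return min_lr
    # Fraction of the decay phase completed, in [0, 1).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    # Cosine factor falls smoothly from 1 to 0 as progress goes from 0 to 1.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


if __name__ == "__main__":
    for s in (0, 50, 99, 100, 550, 999, 1000):
        print(f"step {s:4d}  lr {lr_at_step(s):.6e}")
```

In a training loop, a value like this would typically be written into each optimizer parameter group's `lr` entry before the optimizer step. Keeping the schedule as a pure function of the step number makes it easy to plot and to check in isolation.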