AI From Scratch / Lesson 25 / ~75 minutes

Speculative Decoding and EAGLE

A frontier LLM generating one token requires a full forward pass over billions of parameters. That forward pass is massively over-provisioned: most of the time a much smaller model can guess the next 3-5 tokens correctly, and the big model...
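The draft-then-verify idea can be sketched in a few lines of numpy. This is a minimal illustration under loud assumptions, not the lesson's implementation: both "models" are hypothetical bigram logit tables (`target_table`, `draft_table`), the draft is simply a noisy copy of the target, and verification is the greedy variant (accept a drafted token only if it matches the target's argmax). All names and sizes here are invented for the example.

```python
import numpy as np

VOCAB = 16  # toy vocabulary size (assumption for this sketch)
rng = np.random.default_rng(0)

# Hypothetical toy "models": bigram logit tables indexed by the previous token.
target_table = rng.normal(size=(VOCAB, VOCAB))
# The draft is a noisy copy of the target, so it agrees often but not always.
draft_table = target_table + 0.5 * rng.normal(size=(VOCAB, VOCAB))


def target_logits(tokens):
    """Stand-in for ONE big-model forward pass: next-token logits per position."""
    return target_table[np.asarray(tokens)]


def draft_next(token):
    """Cheap draft model: greedy next token given the previous token."""
    return int(np.argmax(draft_table[token]))


def speculative_step(tokens, k=4):
    """Draft k tokens, verify them with a single target pass (greedy variant)."""
    drafted, last = [], tokens[-1]
    for _ in range(k):
        last = draft_next(last)
        drafted.append(last)

    # One target pass scores the last context token plus all k drafted tokens.
    logits = target_logits([tokens[-1]] + drafted)  # shape (k + 1, VOCAB)
    target_choice = logits.argmax(axis=-1)

    accepted = []
    for i, tok in enumerate(drafted):
        if tok == target_choice[i]:
            accepted.append(tok)  # draft guessed what the target would emit
        else:
            accepted.append(int(target_choice[i]))  # first mismatch: use target's token
            return accepted
    accepted.append(int(target_choice[k]))  # all k accepted: one bonus token for free
    return accepted


def generate_speculative(prompt, n_new, k=4):
    """Generate n_new tokens; also count how many target passes were needed."""
    tokens, passes = list(prompt), 0
    while len(tokens) - len(prompt) < n_new:
        tokens += speculative_step(tokens, k)
        passes += 1
    return tokens[: len(prompt) + n_new], passes


def generate_greedy(prompt, n_new):
    """Baseline: one target pass per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(int(np.argmax(target_table[tokens[-1]])))
    return tokens
```

Calling `generate_speculative([3], 20)` should return the same 20 tokens as `generate_greedy([3], 20)`, typically with fewer target passes, since every step yields at least one token and up to `k + 1`. With sampling instead of greedy decoding, the accept rule is usually a rejection-sampling test on the two models' probabilities rather than an exact argmax match.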

Build / Python (with numpy) / No prerequisites
