AI From Scratch/Lesson 28/~60 minutes

Long-Context Evaluation — NIAH, RULER, LongBench, MRCR

Gemini 3 Pro advertises a 1M-token context window, yet at 1M tokens its 8-needle MRCR score drops to 26.3%. Advertised ≠ usable. Long-context evaluation measures how much of that window the model you are shipping on can actually use.
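To make the gap between advertised and usable context concrete, here is a minimal needle-in-a-haystack (NIAH) sketch. It is an illustration under stated assumptions, not the official NIAH, RULER, LongBench, or MRCR harness. The names `build_haystack`, `run_niah`, `accuracy_by_length`, and `toy_truncating_model` are hypothetical. Haystack length is counted in filler sentences as a rough stand-in for tokens. The toy model only simulates a model that attends to the tail of its prompt; in practice you would replace it with a call to the model you are evaluating.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical filler sentence; real harnesses use essays or other natural text.
FILLER = "The sky was clear and the market opened on time."


def build_haystack(needle: str, n_sentences: int, depth: float) -> str:
    """Insert `needle` at relative `depth` (0.0 = start, 1.0 = end) of the filler."""
    sentences = [FILLER] * n_sentences
    sentences.insert(round(depth * n_sentences), needle)
    return " ".join(sentences)


def run_niah(
    model_fn: Callable[[str], str],
    lengths: List[int],
    depths: List[float],
    secret: str = "7421",
) -> Dict[Tuple[int, float], bool]:
    """Query the model once per (length, depth) cell and record whether it retrieved the secret."""
    needle = f"The magic number is {secret}."
    question = "What is the magic number? Answer with the number only."
    results = {}
    for n in lengths:
        for d in depths:
            prompt = build_haystack(needle, n, d) + "\n\n" + question
            results[(n, d)] = secret in model_fn(prompt)
    return results


def accuracy_by_length(results: Dict[Tuple[int, float], bool]) -> Dict[int, float]:
    """Average retrieval accuracy over depths, grouped by haystack length."""
    grouped: Dict[int, List[bool]] = {}
    for (n, _depth), ok in results.items():
        grouped.setdefault(n, []).append(ok)
    return {n: sum(oks) / len(oks) for n, oks in grouped.items()}


def toy_truncating_model(prompt: str, window: int = 2000) -> str:
    """Stand-in for a real model: it only 'sees' the last `window` characters."""
    match = re.search(r"\d{4}", prompt[-window:])
    return match.group(0) if match else "unknown"


if __name__ == "__main__":
    results = run_niah(toy_truncating_model, lengths=[10, 200], depths=[0.0, 0.5, 1.0])
    for length, acc in sorted(accuracy_by_length(results).items()):
        print(f"{length:>4} sentences: accuracy {acc:.2f}")
```

With the toy model, the 10-sentence haystack is answered at every depth, while the 200-sentence haystack is answered only when the needle sits at the end. A per-length, per-depth grid like this exposes that kind of falloff, which a single headline context size hides.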

