Phase 07: Transformers Deep Dive
AI From Scratch/Lesson 06/~45 minutes

BERT — Masked Language Modeling

GPT predicts the next word. BERT predicts a missing word. One sentence of difference — and half a decade of everything embedding-shaped.

BuildPython
Loading lesson page...