Loading lesson page...
AI From Scratch/Lesson 12/~120 minutes
Emu3: Next-Token Prediction for Image and Video Generation
BAAI's Emu3 (Wang et al., September 2024) is the 2024 result that should have ended the diffusion-versus-autoregressive debate. A single Llama-style decoder-only transformer, trained only on the next-token-prediction objective, across a un...
LearnPython (stdlib3D video tokenizer math + autoregressive sampler skeleton)