AI From Scratch / Lesson 19 / ~180 minutes
Audio-Language Models: the Whisper to Audio Flamingo 3 Arc
Whisper (Radford et al., December 2022) settled speech recognition: 680k hours of weakly supervised multilingual speech, a simple encoder-decoder transformer, and a benchmark that every subsequent ASR release cites. But recognition is...
Build / No prerequisites