Phase 07: Transformers Deep Dive
AI From Scratch/Lesson 10/~45 minutes

Audio Transformers — Whisper Architecture

Audio becomes an image of frequency over time once you turn it into a spectrogram. Whisper is an encoder-decoder Transformer that, like a ViT, eats log-mel spectrograms and speaks back in text.
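To make the "image of frequency over time" idea concrete, here is a minimal sketch of a log-mel spectrogram in pure standard-library Python. It is not Whisper's actual preprocessing code: the function names, the naive DFT, and the tiny sizes (`n_fft=64`, `hop=32`, `n_mels=8`, an 8 kHz test tone) are all assumptions chosen so the example runs quickly without dependencies. A real pipeline would use an FFT library and much larger settings.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one Hann-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        spectrum.append(abs(s) ** 2)
    return spectrum


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def log_mel_spectrogram(samples, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames (time), each a list of log mel energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(samples) - n_fft + 1, hop):
        power = power_spectrum(samples[start:start + n_fft])
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Floor the energy so silent bands do not produce log(0).
        frames.append([math.log10(max(e, 1e-10)) for e in energies])
    return frames


# Usage: a pure 1 kHz tone becomes a 2-D grid of (time, mel band) values.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = log_mel_spectrogram(tone, sample_rate)
print(len(spec), len(spec[0]))  # 15 frames, 8 mel bands
```

Each row of `spec` is one time step and each column is one mel band, so the result can be treated like a small grayscale image. Whisper applies the same idea at a larger scale: 16 kHz audio, 25 ms windows with a 10 ms hop, and 80 mel bands.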

Learn · Python · No prerequisites