Loading lesson page...
AI From Scratch/Lesson 10/~45 minutes
Audio Transformers — Whisper Architecture
Audio is an image of frequency over time. Whisper is a ViT that eats mel spectrograms and speaks back.
LearnPythonNo prerequisites
Audio is an image of frequency over time. Whisper is a ViT that eats mel spectrograms and speaks back.