Loading lesson page...
AI From Scratch/Lesson 09/~45 minutes
Vision Transformers (ViT)
An image is a grid of patches. A sentence is a grid of tokens. The same transformer eats both.
BuildPython
An image is a grid of patches. A sentence is a grid of tokens. The same transformer eats both.