AI From Scratch/Lesson 02/~60 minutes

STaR, V-STaR, Quiet-STaR — Self-Taught Reasoning

The smallest possible self-improvement loop sits inside the rationale. A model generates chains of thought, keeps the ones that land on correct answers, and fine-tunes on those. That is STaR. V-STaR adds a verifier so inference-time selec...
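The generate-then-filter step described above can be sketched in a few lines. This is a minimal illustration and not the lesson's own code: `star_filter`, `toy_generate`, and the canned data are hypothetical names invented here, the sampler is a lookup table standing in for a language model, and the fine-tuning step is left out.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]   # (question, gold answer)
Sample = Tuple[str, str]    # (rationale, predicted answer)


def star_filter(
    dataset: List[Example],
    generate: Callable[[str], List[Sample]],
) -> List[Tuple[str, str, str]]:
    """Keep (question, rationale, answer) triples whose answer matches gold."""
    kept = []
    for question, gold in dataset:
        for rationale, predicted in generate(question):
            # Only rationales that land on the correct answer survive.
            if predicted.strip() == gold.strip():
                kept.append((question, rationale, predicted))
    return kept


def toy_generate(question: str) -> List[Sample]:
    """Hypothetical stand-in for sampling chains of thought from a model."""
    canned = {
        "2 + 3": [("2 plus 3 is 5", "5"), ("2 times 3 is 6", "6")],
        "10 - 4": [("10 minus 4 is 7", "7")],
    }
    return canned.get(question, [])


dataset = [("2 + 3", "5"), ("10 - 4", "6")]
kept = star_filter(dataset, toy_generate)
print(kept)
```

In a real loop the `kept` triples would become the fine-tuning set for the next round, and the updated model would replace `toy_generate`. Note that the filter checks only the final answer, so a flawed rationale that happens to reach the right answer still passes.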

Learn/No prerequisites
