AI From Scratch/Lesson 02/~60 minutes
STaR, V-STaR, Quiet-STaR — Self-Taught Reasoning
The smallest possible self-improvement loop sits inside the rationale. A model generates chains of thought, keeps the ones that reach correct answers, and fine-tunes on those. That is STaR. V-STaR adds a verifier so inference-time selec...
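The generate-and-filter half of the STaR loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: `star_round`, `toy_sampler`, and the parameter `k` are hypothetical names, the "model" is a toy arithmetic guesser standing in for a language model, and the fine-tuning step is left out because it depends on the training stack.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[str, str]          # (question, gold answer)
Rationale = Tuple[str, str, str]   # (question, chain of thought, final answer)


def star_round(
    sample: Callable[[str], Tuple[str, str]],
    dataset: List[Example],
    k: int = 4,
) -> List[Rationale]:
    """Sample k rationales per question and keep those with a correct answer.

    `sample` is a hypothetical stand-in for a model call: it takes a
    question and returns (chain_of_thought, final_answer).
    """
    kept: List[Rationale] = []
    for question, gold in dataset:
        for _ in range(k):
            thought, answer = sample(question)
            if answer == gold:  # filter on the final answer only
                kept.append((question, thought, answer))
    return kept


def toy_sampler(rng: random.Random) -> Callable[[str], Tuple[str, str]]:
    """Toy 'model' (assumption): adds two integers, sometimes off by one."""
    def sample(question: str) -> Tuple[str, str]:
        a, b = (int(tok) for tok in question.split("+"))
        guess = a + b + rng.choice([0, 0, 1])
        return f"{a} plus {b} gives {guess}", str(guess)
    return sample


dataset = [("2+3", "5"), ("10+7", "17"), ("4+4", "8")]
kept = star_round(toy_sampler(random.Random(0)), dataset)
# `kept` is the data the next fine-tuning pass would train on.
```

In a full loop, the model would be fine-tuned on `kept` and the round repeated with the updated model. Note that the filter checks only the final answer, so a flawed chain of thought that happens to reach the right answer still passes.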
Learn/No prerequisites