AI From Scratch / Lesson 11 / ~60 minutes
Scalable Oversight and Weak-to-Strong Generalization
Burns et al. (OpenAI Superalignment, "Weak-to-Strong Generalization", 2023) proposed a proxy for the superalignment problem: fine-tune a strong model using labels produced by a weaker model. If the strong model generalizes correctly from i...
Learn | Python (stdlib W2SG gap simulator) | No prerequisites
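
The weak-to-strong setup can be made concrete with the metric Burns et al. use to summarize it, performance gap recovered (PGR): the fraction of the gap between the weak supervisor and the strong model's ceiling that the weakly supervised strong model closes. The sketch below is a minimal stdlib illustration, not the lesson's own simulator. The function name and the accuracy values are hypothetical placeholders chosen for the example, not results from the paper.

```python
def performance_gap_recovered(weak_acc: float, w2s_acc: float, ceiling_acc: float) -> float:
    """Fraction of the weak-to-ceiling gap closed by the weakly supervised strong model.

    weak_acc    -- accuracy of the weak supervisor on held-out ground truth
    w2s_acc     -- accuracy of the strong model fine-tuned on weak labels
    ceiling_acc -- accuracy of the strong model fine-tuned on ground-truth labels
    """
    gap = ceiling_acc - weak_acc
    if gap <= 0:
        # PGR is undefined when the strong ceiling does not beat the weak supervisor.
        raise ValueError("ceiling_acc must exceed weak_acc")
    return (w2s_acc - weak_acc) / gap


if __name__ == "__main__":
    # Hypothetical accuracies, for illustration only.
    weak, w2s, ceiling = 0.60, 0.75, 0.90
    pgr = performance_gap_recovered(weak, w2s, ceiling)
    print(f"PGR = {pgr:.2f}")
```

A PGR of 1 would mean the strong model fully recovers its ceiling despite the weak labels; a PGR of 0 would mean it does no better than its supervisor. Values below 0 are possible if the student ends up worse than the supervisor.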