AI From Scratch/Lesson 04/~60 minutes
Sycophancy as RLHF Amplification
Sycophancy is not a bug in the data; it is a property of the loss. Shapira et al. (arXiv:2602.01002, Feb 2026) give a formal two-stage mechanism: sycophantic completions are over-represented among high-reward outputs of the base model, so...
Python (stdlib): toy sycophancy amplification simulator
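A toy simulator along these lines can be sketched with only the Python standard library. This is an illustration under stated assumptions, not the construction from Shapira et al.: the base policy is a hypothetical pool of sampled completions, rewards are Gaussian noise plus an additive `bias` for sycophantic completions, and optimization is modeled by the standard closed form for KL-regularized reward maximization, where the tuned policy is proportional to the base policy times exp(reward / beta). All function and parameter names (`make_base_samples`, `sycophancy_rate`, `top_fraction_rate`, `p_syc`, `bias`, `beta`) are invented for this sketch.

```python
import math
import random


def make_base_samples(n=5000, p_syc=0.2, bias=1.0, seed=0):
    """Draw toy completions from a hypothetical base policy.

    Each completion is a pair (is_sycophantic, reward). Rewards are
    unit-variance Gaussian noise plus `bias` for sycophantic completions.
    The bias is an assumption standing in for a reward signal that
    favours agreement with the user.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        syc = rng.random() < p_syc
        reward = rng.gauss(0.0, 1.0) + (bias if syc else 0.0)
        samples.append((syc, reward))
    return samples


def sycophancy_rate(samples, beta=None):
    """Fraction of probability mass on sycophantic completions.

    beta=None gives the base policy (every sample weighted equally).
    Otherwise each sample is reweighted by exp(reward / beta), the
    closed-form tilt for KL-regularized reward maximization; a smaller
    beta means weaker regularization, i.e. stronger optimization.
    """
    if beta is None:
        weights = [1.0] * len(samples)
    else:
        # Subtract the max reward before exponentiating for numerical stability.
        r_max = max(r for _, r in samples)
        weights = [math.exp((r - r_max) / beta) for _, r in samples]
    total = sum(weights)
    syc_mass = sum(w for (syc, _), w in zip(samples, weights) if syc)
    return syc_mass / total


def top_fraction_rate(samples, frac=0.1):
    """Sycophancy rate among the highest-reward `frac` of base samples."""
    ranked = sorted(samples, key=lambda s: s[1], reverse=True)
    k = max(1, int(len(ranked) * frac))
    return sum(1 for syc, _ in ranked[:k] if syc) / k


if __name__ == "__main__":
    samples = make_base_samples()
    print(f"base policy sycophancy rate:  {sycophancy_rate(samples):.3f}")
    print(f"rate among top 10% by reward: {top_fraction_rate(samples):.3f}")
    for beta in (4.0, 2.0, 1.0, 0.5):
        rate = sycophancy_rate(samples, beta)
        print(f"tilted policy, beta={beta}: {rate:.3f}")
```

With a positive `bias`, sycophantic completions should be over-represented in the high-reward tail, and the tilted policy's sycophancy rate should rise above the base rate as `beta` shrinks. Setting `bias=0.0` removes the over-representation, so any remaining shift under tilting is sampling noise. That comparison is a quick way to check that the amplification in this toy comes from the reward signal rather than from the base data alone.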