AI From Scratch/Lesson 05/~60 minutes
Constitutional AI and RLAIF
Bai et al. (arXiv:2212.08073, 2022) asked: what if we replaced the human labeler with an AI that reads a list of principles? Constitutional AI has two phases — self-critique and revision under a constitution, then RL from AI Feedback. The...
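As a rough illustration of the two phases named above, here is a minimal Python sketch. It is not the implementation from the paper. The `Model` type, the function names `critique_and_revise` and `ai_preference`, the prompt wording, the two principles in `CONSTITUTION`, and the `toy_model` stand-in are all assumptions made for this example. Looping over every principle in order is a simplification chosen for readability.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function that maps a prompt string to a completion.
Model = Callable[[str], str]

# Illustrative principles only; not the constitution used in the paper.
CONSTITUTION: List[str] = [
    "Choose the response that is least harmful.",
    "Choose the response that is most honest.",
]


def critique_and_revise(model: Model, prompt: str, principles: List[str]) -> str:
    """Phase 1 sketch: draft a response, then critique and revise it per principle."""
    response = model(prompt)
    for principle in principles:
        critique = model(
            f"Prompt: {prompt}\nResponse: {response}\n"
            f"Critique the response using this principle: {principle}"
        )
        response = model(
            f"Prompt: {prompt}\nResponse: {response}\nCritique: {critique}\n"
            "Rewrite the response to address the critique."
        )
    # (prompt, response) pairs would serve as supervised fine-tuning data.
    return response


def ai_preference(
    labeler: Model, prompt: str, a: str, b: str, principle: str
) -> Tuple[str, str]:
    """Phase 2 sketch: an AI labeler compares two responses.

    Returns (chosen, rejected); such pairs would train a preference model
    whose score acts as the reward signal during RL.
    """
    verdict = labeler(
        f"Principle: {principle}\nPrompt: {prompt}\n(A) {a}\n(B) {b}\n"
        "Which response better follows the principle? Answer A or B."
    )
    return (a, b) if verdict.strip().upper().startswith("A") else (b, a)


def toy_model(text: str) -> str:
    """Stand-in for a language model so the sketch runs without any API."""
    if "Answer A or B" in text:
        return "B"
    if "Rewrite the response" in text:
        return "revised response"
    if "Critique the response" in text:
        return "the response could be safer"
    return "draft response"


if __name__ == "__main__":
    print(critique_and_revise(toy_model, "Explain X.", CONSTITUTION))
    print(ai_preference(toy_model, "Explain X.", "answer one", "answer two", CONSTITUTION[0]))
```

Passing the model as a plain callable keeps the sketch independent of any particular library; swapping `toy_model` for a real model call would leave the two functions unchanged.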
Learn