AI From Scratch / Lesson 09 / ~45 minutes
Constitutional AI and Self-Improvement
RLHF needs humans in the loop. Constitutional AI replaces most of them with the model itself. Write a list of principles, have the model critique its own outputs against those principles, and train on the critiques. DeepSeek-R1 pushed this...
Build / Python (stdlib + numpy) / No prerequisites
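The loop described in the summary (write principles, critique an output against them, keep the result for training) can be sketched in a few lines of stdlib Python. This is a toy illustration only, not the lesson's implementation: in Constitutional AI the model itself writes the critique and the revision, whereas here hand-written rules stand in for the model so the example runs without one. Every name below (`Principle`, `critique_and_revise`, `build_sft_pairs`, `toy_generate`) is hypothetical, and collecting (prompt, revised output) pairs is one assumed reading of what gets trained on.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

BANNED = ("idiot", "stupid")  # toy word list, an assumption for this sketch


@dataclass
class Principle:
    """One constitution entry: a name, a critic, and a reviser (all stand-ins)."""
    name: str
    critique: Callable[[str], Optional[str]]  # returns a note, or None if satisfied
    revise: Callable[[str], str]


def critique_insults(text: str) -> Optional[str]:
    hits = [w for w in BANNED if w in text.lower()]
    return f"Contains insulting words: {hits}" if hits else None


def revise_insults(text: str) -> str:
    # Drop the insult (and a leading "you") and re-capitalise the sentence.
    pattern = r"\b(?:you\s+)?(?:" + "|".join(BANNED) + r")\b[,!]*\s*"
    cleaned = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return cleaned[:1].upper() + cleaned[1:]


def critique_shouting(text: str) -> Optional[str]:
    return "Uses repeated exclamation marks" if re.search(r"!{2,}", text) else None


def revise_shouting(text: str) -> str:
    return re.sub(r"!{2,}", ".", text)


CONSTITUTION: List[Principle] = [
    Principle("no_insults", critique_insults, revise_insults),
    Principle("calm_tone", critique_shouting, revise_shouting),
]


def critique_and_revise(draft: str, constitution: List[Principle]) -> Tuple[str, List[Tuple[str, str]]]:
    """Apply each principle in order; revise only when its critique fires."""
    text, log = draft, []
    for principle in constitution:
        note = principle.critique(text)
        if note is not None:
            log.append((principle.name, note))
            text = principle.revise(text)
    return text, log


def toy_generate(prompt: str) -> str:
    # Stand-in for a model's first draft; a real system would sample from an LLM.
    return "You idiot, just restart the server!!!"


def build_sft_pairs(prompts: List[str], generate: Callable[[str], str],
                    constitution: List[Principle]) -> List[Tuple[str, str]]:
    """Collect (prompt, revised output) pairs as candidate fine-tuning data."""
    pairs = []
    for prompt in prompts:
        revised, _ = critique_and_revise(generate(prompt), constitution)
        pairs.append((prompt, revised))
    return pairs
```

Calling `build_sft_pairs(["My app is down"], toy_generate, CONSTITUTION)` returns a list of pairs that a supervised fine-tuning step could consume. Swapping the rule-based `critique` and `revise` callables for calls to the model being trained is what would turn this sketch into the self-critique setup the summary describes.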