AI From Scratch/Lesson 03/~60 minutes
Reflexion: Verbal Reinforcement Learning
Gradient-based RL can need thousands of trials and a GPU cluster to fix a single failure mode. Reflexion (Shinn et al., NeurIPS 2023) makes the correction in natural language, with no weight updates: after each failed trial, the agent writes a reflection, stores it in episodic memory,...
Build · Python (stdlib)
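The loop described above can be sketched in a few lines of stdlib Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `run_reflexion`, `toy_actor`, `toy_evaluator`, `toy_reflector`, the door-picking task, and the `memory_size` cap are all hypothetical names and choices made for this example. In a real agent the actor and reflector would be LLM calls; here they are deterministic stubs so the control flow is easy to follow.

```python
from typing import Callable, List, Tuple

# Hypothetical toy task: exactly one of three doors opens.
DOORS = ["red", "green", "blue"]
OPEN_DOOR = "blue"


def toy_actor(task: str, memory: List[str]) -> str:
    """Stand-in for an LLM policy: pick the first door no reflection warns about."""
    for door in DOORS:
        if not any(f"'{door}'" in reflection for reflection in memory):
            return door
    return DOORS[0]  # every door has been ruled out; fall back to the first


def toy_evaluator(attempt: str) -> Tuple[bool, str]:
    """Stand-in for the environment: return (success, feedback text)."""
    if attempt == OPEN_DOOR:
        return True, "the door opened"
    return False, "the door was locked"


def toy_reflector(attempt: str, feedback: str) -> str:
    """Stand-in for an LLM self-reflection: turn a failure into advice."""
    return f"Door '{attempt}' failed because {feedback}; choose a different door."


def run_reflexion(
    actor: Callable[[str, List[str]], str],
    evaluator: Callable[[str], Tuple[bool, str]],
    reflector: Callable[[str, str], str],
    task: str,
    max_trials: int = 5,
    memory_size: int = 3,  # assumption: keep only the most recent reflections
) -> Tuple[bool, List[str], List[str]]:
    """Try, evaluate, reflect on failure, remember, and retry."""
    memory: List[str] = []    # episodic memory of verbal reflections
    attempts: List[str] = []  # every attempt made, in order
    for _ in range(max_trials):
        attempt = actor(task, memory)  # the actor sees earlier reflections
        attempts.append(attempt)
        success, feedback = evaluator(attempt)
        if success:
            return True, attempts, memory
        memory.append(reflector(attempt, feedback))
        memory = memory[-memory_size:]  # bound the memory
    return False, attempts, memory


solved, attempts, memory = run_reflexion(
    toy_actor, toy_evaluator, toy_reflector, task="open the door"
)
print(solved, attempts)
for reflection in memory:
    print("-", reflection)
```

No parameters change between trials. The only thing that carries over is the list of text reflections passed back to the actor, which is what separates this loop from a gradient-based update.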