AI From Scratch/Lesson 03/~60 minutes

Reflexion: Verbal Reinforcement Learning

Gradient-based RL needs thousands of trials and a GPU cluster to fix a failure mode. Reflexion (Shinn et al., NeurIPS 2023) does it in natural language: after each failed trial, the agent writes a reflection, stores it in episodic memory,...

BuildPython (stdlib)

Loading lesson page...