Loading lesson page...
AI From Scratch/Lesson 15/~60 minutes
Prompt Caching and Context Caching
Your system prompt is 4,000 tokens. Your RAG context is 20,000 tokens. You send both with every request. You also pay for both — every time. Prompt caching lets the provider keep that prefix warm on their side and bill you 10% of the norma...
BuildPython