AI From Scratch/Lesson 10/~60 minutes
Cold Start Mitigation for Serverless LLMs
A 20 GB model image takes anywhere from 5-10 minutes (7B) to more than 20 minutes (70B) to go from cold to serving. In a true serverless world, that is not a warm-up; it is an outage. Mitigations operate at five layers: pre-seeded node images (Bottlerocket on A...
Learn