Phase 17: Infrastructure & Production
AI From Scratch / Lesson 10 / ~60 minutes

Cold Start Mitigation for Serverless LLMs

A 20 GB model image takes anywhere from 5-10 minutes (7B) to 20+ minutes (70B) to go from cold to serving. In a true serverless world, that is not a warm-up; it is an outage. Mitigations operate at five layers: pre-seeded node images (Bottlerocket on A...
