AI From Scratch/Lesson 15/~120 minutes
Janus-Pro: Decoupled Encoders for Unified Multimodal Models
Unified multimodal models have an unavoidable tension. Understanding wants semantic features — SigLIP or DINOv2 output vectors rich with concept-level information. Generation wants reconstruction-friendly codes — VQ tokens that compose bac...
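The contrast between the two kinds of visual representation can be made concrete with a toy sketch. This is not the Janus-Pro implementation: `semantic_encode`, `vq_encode`, `vq_decode`, the 2-d "patches", the projection weights, and the two-entry codebook are all hypothetical stand-ins chosen for illustration. It shows only that an understanding path yields continuous feature vectors, while a generation path yields discrete token ids that map back to pixels-like values through a codebook.

```python
# Illustrative sketch only: toy stand-ins, not the Janus-Pro encoders.
from math import fsum


def semantic_encode(patches, weights):
    """Understanding path (hypothetical): project each patch to a continuous feature vector."""
    return [
        [fsum(p * w for p, w in zip(patch, row)) for row in weights]
        for patch in patches
    ]


def vq_encode(patches, codebook):
    """Generation path (hypothetical): index of the nearest codebook entry per patch."""
    def sq_dist(a, b):
        return fsum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(patch, codebook[k]))
        for patch in patches
    ]


def vq_decode(token_ids, codebook):
    """Look token ids back up in the codebook to get an approximate reconstruction."""
    return [list(codebook[i]) for i in token_ids]


# Made-up data: two 2-d patches, a 2-d -> 3-d projection, a two-entry codebook.
patches = [[0.9, 0.1], [0.2, 0.8]]
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
codebook = [[1.0, 0.0], [0.0, 1.0]]

features = semantic_encode(patches, weights)   # continuous vectors, one per patch
tokens = vq_encode(patches, codebook)          # discrete ids, one per patch
reconstruction = vq_decode(tokens, codebook)   # codebook vectors for those ids

print(features)
print(tokens)
print(reconstruction)
```

In this sketch the two paths share nothing but the input, which is the sense of "decoupled" used in the lesson title: each encoder can be chosen for its own job instead of forcing one representation to serve both.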
Build/No prerequisites