AI From Scratch

Phase 16 / 25 lessons / ~28 hours

Multi-Agent & Swarms

Coordination, emergence, and collective intelligence.

Lessons

01 Why Multi-Agent?
One agent hits a wall. The smart move is not a bigger agent - it is more agents.
Learn / ~60 minutes / TypeScript

02 Heritage of FIPA-ACL and Speech Acts
Before MCP, before A2A, there was FIPA-ACL. In 2000 the IEEE Foundation for Intelligent Physical Agents ratified an agent communication language with twenty performatives, two content languages, and a set of interaction protocols: contrac...
Learn / ~60 minutes / Python (stdlib)

03 Communication Protocols
Agents that can't speak the same language aren't a team. They're strangers shouting into the void.
Build / ~120 minutes / TypeScript

04 The Multi-Agent Primitive Model
Every multi-agent framework shipping in 2026 (AutoGen, LangGraph, CrewAI, OpenAI Agents SDK, Microsoft Agent Framework) is a point in a four-dimensional design space. Four primitives, nothing more: the agent, the handoff, the shared stat...
Learn / ~60 minutes / Python (stdlib)

05 Supervisor / Orchestrator-Worker Pattern
One lead agent plans and delegates; specialized workers execute in parallel contexts and report back. This is the pattern behind Anthropic's Research system (Claude Opus 4 as lead, Sonnet 4 as subagents), measured at +90.2% over single-age...
Learn + Build / ~75 minutes / Python (stdlib, threading)

06 Hierarchical Architecture and Its Failure Mode
Hierarchical is supervisor nested. Manager agents over sub-managers over workers. CrewAI Process.hierarchical is the textbook version: a manager_llm dynamically delegates tasks and validates outputs. The LangGraph equivalent is create_supe...
Learn + Build / ~60 minutes / Python (stdlib)

07 Society of Mind and Multi-Agent Debate
Minsky's 1986 premise, that intelligence is a society of specialists, gets rediscovered every decade. In 2023 Du et al. turned it into a concrete algorithm: multiple LLM instances propose answers, read each other's answers, critique, and upda...
Learn + Build / ~60 minutes / Python (stdlib)

08 Role Specialization: Planner, Critic, Executor, Verifier
The most common multi-agent decomposition in 2026: one agent plans, one executes, one critiques or verifies. MetaGPT (arXiv:2308.00352) formalizes this as SOPs encoded into role prompts: Product Manager, Architect, Project Manager, Engine...
Learn + Build / ~60 minutes / Python (stdlib)

09 Parallel / Swarm / Networked Architectures
Contrast with supervisor: no central decider. Agents read a shared event bus, pick up work asynchronously, write results back. LangGraph explicitly supports "Swarm Architecture" for decentralized, dynamic environments. Matrix (arXiv:2511.2...
Learn + Build / ~75 minutes / Python (stdlib, threading, queue)

10 Group Chat and Speaker Selection
AutoGen GroupChat and AG2 GroupChat share one conversation across N agents; a selector function (LLM, round-robin, or custom) picks who speaks next. This is the archetype of emergent multi-agent conversation: agents do not know their role...
Learn + Build / ~60 minutes / Python (stdlib)

11 Handoffs and Routines: Stateless Orchestration
OpenAI's Swarm (October 2024) distilled multi-agent orchestration to two primitives: routines (instructions + tools as a system prompt) and handoffs (a tool that returns another Agent). No state machine, no branching DSL: the LLM routes b...
Learn + Build / ~60 minutes / Python (stdlib)

12 A2A: The Agent-to-Agent Protocol
Google announced A2A in April 2025; by April 2026 the spec is at https://a2a-protocol.org/latest/specification/ and 150+ organizations back it. A2A is the horizontal complement to MCP (Lesson 13): where MCP is vertical (agent ↔ tools), A2A...
Learn + Build / ~75 minutes / Python (stdlib, http.server, json)

13 Shared Memory and Blackboard Patterns
Two approaches coexist in 2026 multi-agent systems: the message pool (everyone sees everyone's messages, as in AutoGen GroupChat or MetaGPT) and the blackboard with subscription (agents subscribe to relevant events, as in Context-Aware MCP...
Learn + Build / ~75 minutes / Python (stdlib, threading)

14 Consensus and Byzantine Fault Tolerance for Agents
Classical distributed-systems BFT meets stochastic LLMs. In 2025-2026 three research directions emerged: CP-WBFT (arXiv:2511.10400) weighs each vote by a confidence probe; DecentLLMs (arXiv:2507.14928) goes leaderless with parallel worker...
Learn + Build / ~75 minutes / Python (stdlib)

15 Voting, Self-Consistency, and Debate Topology
The cheapest aggregation: sample N independent agents, majority-vote. Wang et al. 2022 self-consistency did this with one model sampled N times. Multi-agent extends it with heterogeneous agents to escape monoculture: different models, dif...
Learn + Build / ~75 minutes / Python (stdlib)

16 Negotiation and Bargaining
Agents negotiate resources, prices, task allocations, and terms. The 2026 benchmark set is clear: NegotiationArena (arXiv:2402.05863) shows LLMs can improve payoffs ~20% via persona manipulation ("desperation"); "Measuring Bargaining Abili...
Learn + Build / ~75 minutes / Python (stdlib)

17 Generative Agents and Emergent Simulation
Park et al. 2023 (UIST '23, arXiv:2304.03442) populated Smallville, a sandbox of 25 agents, with a three-part architecture: memory stream (natural-language log), reflection (higher-level syntheses the agent generates about its own stream),...
Learn + Build / ~75 minutes / Python (stdlib)

18 Theory of Mind and Emergent Coordination
Li et al. (arXiv:2310.10701) showed that LLM agents in a cooperative text game exhibit emergent high-order Theory of Mind (ToM), that is, reasoning about what another agent believes about a third agent's beliefs, but fail on long-horizon planning...
Learn + Build / ~75 minutes / Python (stdlib)

19 Swarm Optimization for LLMs (PSO, ACO)
Bio-inspired optimization is making an LLM comeback. LMPSO (arXiv:2504.09247) uses PSO where each particle's velocity is a prompt and the LLM generates the next candidate; works well on structured-sequence outputs (math expressions, progra...
Learn + Build / ~75 minutes / Python (stdlib)

20 MARL: MADDPG, QMIX, MAPPO
The reinforcement-learning heritage of multi-agent coordination, which still informs LLM-agent systems in 2026. MADDPG (Lowe et al., NeurIPS 2017, arXiv:1706.02275) introduced Centralized Training, Decentralized Execution (CTDE): each crit...
Learn / ~90 minutes

21 Agent Economies, Token Incentives, Reputation
Long-horizon autonomous agents (METR's 1-hour to 8-hour work-curve) need economic agency. The emerging 5-layer stack is: DePIN (physical compute) → Identity (W3C DIDs + reputation capital) → Cognition (RAG + MCP) → Settlement (account abst...
Learn / ~75 minutes / Python (stdlib)

22 Production Scaling: Queues, Checkpoints, Durability
Scaling multi-agent systems to thousands of concurrent runs requires durable execution. LangGraph's runtime writes a checkpoint after each super-step keyed by thread_id (Postgres by default); worker crashes release a lease and another work...
Learn + Build / ~75 minutes / Python (stdlib, asyncio, sqlite3)

23 Failure Modes: MAST, Groupthink, Monoculture, Cascading Errors
The reference taxonomy for 2026 is MAST (Cemri et al., NeurIPS 2025, arXiv:2503.13657), derived from 1642 execution traces across 7 state-of-the-art open-source MAS showing 41–86.7% failure rate. Three root categories: Specification Proble...
Learn / ~75 minutes / Python (stdlib)

24 Evaluation and Coordination Benchmarks
Five 2025-2026 benchmarks cover the multi-agent evaluation space. MultiAgentBench / MARBLE (ACL 2025, arXiv:2503.01935) evaluates star/chain/tree/graph topologies with milestone KPIs; graph is best for research, cognitive planning adds ~3%...
Learn / ~75 minutes / Python (stdlib)

25 Case Studies and the 2026 State of the Art
Three production-grade references to study end-to-end, each illustrating a different slice of multi-agent engineering. Anthropic's Research system (orchestrator-worker, 15x tokens, +90.2% over single-agent Opus 4, rainbow deployments) is t...
Learn (capstone) / ~90 minutes