AI From Scratch
Phase 13: 23 lessons, ~24.5 hours

Tools & Protocols

The interfaces between AI and the real world.

Lessons
01 The Tool Interface — Why Agents Need Structured I/O
A language model produces tokens. A program takes actions. The gap between those two is the tool interface: a contract that lets the model request an action and the host execute it. Every 2026 stack — function calling on OpenAI, Anthropic,...
Learn / ~45 minutes / Python (stdlib, no LLM)

02 Function Calling Deep Dive — OpenAI, Anthropic, Gemini
The three frontier providers converged on the same tool-call loop in 2024 and then diverged on everything else. OpenAI uses tools and tool_calls. Anthropic uses tool_use and tool_result blocks. Gemini uses functionDeclarations and unique-i...
Build / ~75 minutes / Python (stdlib, schema translators)

03 Parallel Tool Calls and Streaming with Tools
Three independent weather lookups serialized is three round trips. Run them in parallel and total time collapses to the slowest single call. Every frontier provider now emits multiple tool calls in a single turn. The payoff is real; the pl...
Build / ~75 minutes / Python (stdlib, thread pool + streaming harness)

04 Structured Output — JSON Schema, Pydantic, Zod, Constrained Decoding
"Ask the model nicely to return JSON" fails 5 to 15 percent of the time, even on frontier models. Structured outputs close that gap with constrained decoding: the model is literally prevented from emitting a token that would violate the sc...
Build / ~75 minutes

05 Tool Schema Design — Naming, Descriptions, Parameter Constraints
A correct tool fails silently when the model cannot tell when to use it. Naming, descriptions, and parameter shapes drive 10 to 20 percentage-point swings in tool-selection accuracy on benchmarks like StableToolBench and MCPToolBench++. Th...
Learn / ~45 minutes / Python (stdlib, tool schema linter)

06 MCP Fundamentals — Primitives, Lifecycle, JSON-RPC Base
Every integration before MCP was a one-off. The Model Context Protocol, first shipped by Anthropic in November 2024 and now stewarded by the Linux Foundation's Agentic AI Foundation, standardizes discovery and invocation so any client can...
Learn / ~45 minutes

07 Building an MCP Server — Python + TypeScript SDKs
Most MCP tutorials show only stdio hello-worlds. A real server exposes tools plus resources plus prompts, handles capability negotiation, emits structured errors, and works the same across SDKs. This lesson builds a notes server end-to-end...
Build / ~75 minutes / Python (stdlib, stdio MCP server)

08 Building an MCP Client — Discovery, Invocation, Session Management
Most MCP content ships server tutorials and waves a hand at the client. Client code is where the hard orchestration lives: process spawning, capability negotiation, tool list merging across multiple servers, sampling callbacks, reconnectio...
Build / ~75 minutes

09 MCP Transports — stdio vs Streamable HTTP vs SSE Migration
stdio works locally and nowhere else. Streamable HTTP (2025-03-26) is the remote standard. The old HTTP+SSE transport is deprecated and being removed in mid-2026. Picking the wrong transport costs a migration; picking the right one buys a...
Learn / ~45 minutes / Python (stdlib, Streamable HTTP endpoint skeleton)

10 MCP Resources and Prompts — Context Exposure Beyond Tools
Tools get 90 percent of MCP attention. The other two server primitives solve different problems. Resources expose data for reading; prompts expose reusable templates as slash-commands. Many servers should use resources instead of wrapping...
Build / ~45 minutes / Python (stdlib, resource + prompt handler)

11 MCP Sampling — Server-Requested LLM Completions and Agent Loops
Most MCP servers are dumb executors: take arguments, run code, return content. Sampling lets a server flip direction: it asks the client's LLM to make a decision. This enables server-hosted agent loops without the server owning any model c...
Build / ~75 minutes / Python (stdlib, sampling harness)

12 Roots and Elicitation — Scoping and Mid-Flight User Input
Hard-coded paths break the moment a user opens a different project. Pre-filled tool arguments break when the user under-specifies. Roots scope the server to a user-controlled set of URIs; elicitation pauses mid-tool-call to ask the user fo...
Build / ~45 minutes / Python (stdlib, roots + elicitation demo)

13 Async Tasks (SEP-1686) — Call-Now, Fetch-Later for Long-Running Work
Real agent work takes minutes to hours: CI runs, deep-research synthesis, batch exports. Synchronous tool calls drop connections, time out, or block the UI. SEP-1686, merged in the 2025-11-25 spec, adds a Tasks primitive: any request can be augment...
Build / ~75 minutes / Python (stdlib, async task state machine)

14 MCP Apps — Interactive UI Resources via `ui://`
Text-only tool output caps what agents can show. MCP Apps (SEP-1724, official January 26, 2026) let a tool return sandboxed interactive HTML rendered inline in Claude Desktop, ChatGPT, Cursor, Goose, and VS Code. Dashboards, forms, maps, 3...
Build / ~75 minutes / Python (stdlib, UI resource emitter), HTML (sample app)

15 MCP Security I — Tool Poisoning, Rug Pulls, Cross-Server Shadowing
Tool descriptions land in the model's context verbatim. Malicious servers embed hidden instructions that users never see. Research in 2025-2026 from Invariant Labs, Unit 42, and an arXiv study published March 2026 measured attack-success r...
Learn / ~45 minutes

16 MCP Security II — OAuth 2.1, Resource Indicators, Incremental Scopes
Remote MCP servers need authorization, not just authentication. The 2025-11-25 spec aligns with OAuth 2.1 + PKCE + resource indicators (RFC 8707) + protected-resource metadata (RFC 9728). SEP-835 adds incremental scope consent with step-up...
Build / ~75 minutes / Python (stdlib, OAuth state machine simulator)

17 MCP Gateways and Registries — Enterprise Control Planes
Enterprises cannot let every dev install random MCP servers. A gateway centralizes auth, RBAC, audit, rate limiting, caching, and tool-poisoning detection, then exposes the merged tool surface as a single MCP endpoint. The Official MCP Reg...
Learn / ~45 minutes / Python (stdlib, minimal gateway)

18 MCP Auth in Production — Enrollment, JWKS Refresh, Audience-Pinned Tokens
Lesson 16 stood up the OAuth 2.1 state machine in memory. By 2026, every MCP server you ship to a real org sits behind production auth: client enrollment that scales to an unbounded client population (Client ID Metadata Documents first, dy...
Build / ~90 minutes / Python (stdlib)

19 A2A — Agent-to-Agent Protocol
MCP is agent-to-tool. A2A (Agent2Agent) is agent-to-agent — an open protocol for letting opaque agents built on different frameworks collaborate. Released by Google in April 2025, donated to the Linux Foundation in June 2025, reaching v1.0...
Build / ~75 minutes / Python (stdlib, Agent Card + Task harness)

20 OpenTelemetry GenAI — Tracing Tool Calls End-to-End
An agent calls five tools, three MCP servers, and two sub-agents. You need one trace across all of it. The OpenTelemetry GenAI semantic conventions (stable attributes in v1.37 and up) are the 2026 standard, natively supported by Datadog, L...
Build / ~75 minutes / Python (stdlib, OTel span emitter)

21 LLM Routing Layer — LiteLLM, OpenRouter, Portkey
Provider lock-in is expensive. Different tool-calling workloads suit different models. Routing gateways give one API surface, retries, failover, cost tracking, and guardrails. Three archetypes dominate 2026: LiteLLM (open-source self-hoste...
Learn / ~45 minutes / Python (stdlib, routing + failover + cost tracker)

22 Skills and Agent SDKs — Anthropic Skills, AGENTS.md, OpenAI Apps SDK
MCP says "what tools exist." Skills say "how to do a task." The 2026 stack layers both. Anthropic's Agent Skills (open standard, December 2025) ship as SKILL.md with progressive disclosure. OpenAI's Apps SDK is MCP plus widget metadata. AG...
Learn / ~45 minutes / Python (stdlib, SKILL.md parser and loader)

23 Capstone — Build a Complete Tool Ecosystem
Phase 13 taught every piece. This capstone wires them into one production-shaped system: an MCP server with tools + resources + prompts + tasks + UI, OAuth 2.1 at the edge, an RBAC gateway, a multi-server client, an A2A sub-agent call, OTe...
Build / ~120 minutes
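
Lesson 03's claim, that independent tool calls run in parallel take roughly as long as the slowest single call, can be previewed with a small stdlib sketch. Everything named here is hypothetical and for illustration only: `get_weather` is a stand-in tool that sleeps instead of calling a network, `TOOLS` is an assumed registry, and the call dictionaries (`id`, `name`, `arguments`) are a simplified shape, not any provider's wire format or the course's own harness.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def get_weather(city: str) -> dict:
    """Hypothetical tool: the sleep stands in for a slow network lookup."""
    time.sleep(0.05)
    return {"city": city, "forecast": "sunny"}


# Assumed registry mapping tool names to plain Python callables.
TOOLS = {"get_weather": get_weather}


def execute(call: dict) -> dict:
    """Run one tool call and wrap the outcome so a failure never raises."""
    try:
        result = TOOLS[call["name"]](**call["arguments"])
        return {"id": call["id"], "ok": True, "result": result}
    except Exception as exc:  # unknown tool, bad arguments, tool error
        return {"id": call["id"], "ok": False, "error": str(exc)}


def run_tool_calls(calls: list[dict]) -> list[dict]:
    """Execute independent tool calls concurrently, preserving call order."""
    if not calls:
        return []
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        # pool.map yields results in input order, not completion order.
        return list(pool.map(execute, calls))


calls = [
    {"id": "call_1", "name": "get_weather", "arguments": {"city": "Paris"}},
    {"id": "call_2", "name": "get_weather", "arguments": {"city": "Tokyo"}},
    {"id": "call_3", "name": "get_weather", "arguments": {"city": "Lima"}},
]

start = time.perf_counter()
results = run_tool_calls(calls)
elapsed = time.perf_counter() - start
print(f"{len(results)} calls finished in {elapsed:.2f}s")
```

Keeping each result tagged with its call `id` matters because the host has to match every result back to the request the model issued, and wrapping errors per call keeps one failed lookup from discarding the other two.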