AI

Artificial intelligence, machine learning, and everything LLM

873 episodes · Page 12 of 44

#2205: When AI Coding Agents Forget: Five Approaches to Context Rot

As coding agents handle longer sessions, they accumulate noise and lose crucial information. Five competing frameworks are solving this differently...

ai-agents, context-window, ai-memory

#2204: Memory Without RAG: The Real Architecture

mem0, Letta, Zep, and LangMem solve agent memory differently than RAG. Here's what's actually happening under the hood.

ai-agents, ai-memory, rag

#2203: Knowledge Without Tools: Why MCPs Aren't Just for Execution

MCPs can be pure knowledge providers with zero tools. Here's why that matters for agents querying government data and authoritative sources.

model-context-protocol, knowledge-graphs, rag

#2196: The Invisible Workforce Behind AI

Annotation is the invisible foundation of AI—and a $17B industry by 2030. Here's what dataset curators actually need to know about the tools, platf...

training-data, ai-training, fine-tuning

#2195: Nash's Real Genius (And Why the Movie Got It Wrong)

The bar scene in A Beautiful Mind is mathematically wrong—and it obscures Nash's actual breakthrough. We trace the real ideas from his 1950 papers ...

ai-agents, game-theory, network-routing

#2194: Game Theory for Multi-Agent AI: Design Better, Fail Less

Nash equilibrium, mechanism design, and why your AI agents are playing prisoner's dilemma whether you know it or not.

ai-agents, ai-alignment, ai-safety

#2193: Running Claude in Your Apartment (The Physics Says No)

Building a local AI inference server to rival Claude Code sounds great until you do the math on heat, noise, and neighbor relations.

local-ai, hardware-engineering, thermal-management

#2192: How We Built a Podcast Pipeline

Hilbert reveals the complete technical architecture behind 2,000+ episodes—from voice memos to GPU-powered TTS, with Claude models, LangGraph workf...

prompt-engineering, speech-recognition, text-to-speech

#2191: Making Multi-Agent AI Actually Work

Research from Google DeepMind, Stanford, and Anthropic reveals most multi-agent systems waste tokens and amplify errors. Single agents with better ...

ai-agents, prompt-engineering, ai-reasoning

#2190: Simulating Extreme Decisions With LLMs

LLMs fail at the exact problem wargaming was built to solve—simulating irrational, extreme decision-makers. A new study reveals why.

large-language-models, ai-safety, hallucinations

#2189: Scaling Multi-Agent Systems: The 45% Threshold

A landmark Google DeepMind study reveals that adding more AI agents often degrades performance, wastes tokens, and amplifies errors—unless your sin...

ai-agents, ai-reasoning, ai-safety

#2188: Is Emergence Real or Just Bad Metrics?

The debate over whether AI models exhibit genuine emergent abilities or just appear to because of how we measure them—and why it matters for safety...

emergent-abilities, ai-training, interpretability

#2187: Why Claude Writes Like a Person (and Gemini Doesn't)

Claude produces prose that sounds human. Gemini reads like Wikipedia. The difference isn't capability—it's how they were trained to think about wri...

large-language-models, fine-tuning, ai-training

#2186: The AI Persona Fidelity Challenge

Advanced LLMs dominate benchmarks but fail at staying in character—especially when asked to play morally complex or antagonistic roles. What does t...

ai-safety, ai-alignment, hallucinations

#2185: Taking AI Agents From Demo to Production

Sixty-two percent of companies are experimenting with AI agents, but only 23% are scaling them—and 40% of projects will be canceled by 2027. The ga...

ai-agents, ai-safety, human-in-the-loop-ai

#2184: The Economics of Running AI Agents

Production AI agents can cost $500K/month before optimization. Learn model routing, prompt caching, and token budgeting to cut costs 40-85% without...

ai-agents, agent-cost-optimization, ai-inference

#2182: Can You Actually Review an AI Agent's Plan?

Most AI agents have plans the way you have a plan while half-asleep—something's happening, but you can't see it. We map the five major planning pat...

ai-agents, ai-reasoning, human-computer-interaction

#2181: When RAG Becomes an Agent

RAG in chatbots is simple retrieval. RAG in agents is a multi-step decision loop. Here's what actually changes.

rag, ai-agents, ai-orchestration

#2180: The Sandboxing Tradeoff in Agent Design

AI agents need broad permissions to be useful—but every permission expands the attack surface. We map the real threat landscape and the isolation t...

ai-agents, ai-security, prompt-injection

#2179: Building Cost-Resilient AI Agents

Failed API calls in agent loops aren't just technical problems—they're direct budget drains. Here's how checkpointing, retry strategies, and cachin...

ai-agents, fault-tolerance, ai-inference