← #ai-agents

#ai-agents

344 episodes · Page 4 of 15

#2195: Nash's Real Genius (And Why the Movie Got It Wrong)

The bar scene in A Beautiful Mind is mathematically wrong—and it obscures Nash's actual breakthrough. We trace the real ideas from his 1950 papers ...

ai-agentsgame-theorynetwork-routing

#2194: Game Theory for Multi-Agent AI: Design Better, Fail Less

Nash equilibrium, mechanism design, and why your AI agents are playing prisoner's dilemma whether you know it or not.

ai-agentsai-alignmentai-safety

#2191: Making Multi-Agent AI Actually Work

Research from Google DeepMind, Stanford, and Anthropic reveals most multi-agent systems waste tokens and amplify errors. Single agents with better ...

ai-agentsprompt-engineeringai-reasoning

#2189: Scaling Multi-Agent Systems: The 45% Threshold

A landmark Google DeepMind study reveals that adding more AI agents often degrades performance, wastes tokens, and amplifies errors—unless your sin...

ai-agentsai-reasoningai-safety

#2185: Taking AI Agents From Demo to Production

Sixty-two percent of companies are experimenting with AI agents, but only 23% are scaling them—and 40% of projects will be canceled by 2027. The ga...

ai-agentsai-safetyhuman-in-the-loop-ai

#2184: The Economics of Running AI Agents

Production AI agents can cost $500K/month before optimization. Learn model routing, prompt caching, and token budgeting to cut costs 40-85% without...

ai-agentsagent-cost-optimizationai-inference

#2182: Can You Actually Review an AI Agent's Plan?

Most AI agents have plans the way you have a plan while half-asleep—something's happening, but you can't see it. We map the five major planning pat...

ai-agentsai-reasoninghuman-computer-interaction

#2181: When RAG Becomes an Agent

RAG in chatbots is simple retrieval. RAG in agents is a multi-step decision loop. Here's what actually changes.

ragai-agentsai-orchestration

#2180: The Sandboxing Tradeoff in Agent Design

AI agents need broad permissions to be useful—but every permission expands the attack surface. We map the real threat landscape and the isolation t...

ai-agentsai-securityprompt-injection

#2179: Building Cost-Resilient AI Agents

Failed API calls in agent loops aren't just technical problems—they're direct budget drains. Here's how checkpointing, retry strategies, and cachin...

ai-agentsfault-toleranceai-inference

#2178: How to Actually Evaluate AI Agents

Frontier models score 80% on one agent benchmark and 45% on another. The difference isn't the model—it's contamination, scaffolding, and how the te...

ai-agentsbenchmarksai-safety

#2174: Role-Playing as Orchestration

How a role-playing protocol from NeurIPS 2023 became one of AI's most underrated agent frameworks—and what happens when you scale it to a million a...

ai-agentsprompt-engineeringai-orchestration

#2173: Inside MiroFish's Agent Simulation Architecture

MiroFish generates thousands of AI agents with distinct personalities to predict social dynamics. But research reveals a critical flaw: LLM agents ...

ai-agentsknowledge-graphsai-reasoning

#2171: How IQT Labs Built a Wargaming LLM (Then Archived It)

A deep code review of Snowglobe, IQT Labs' open-source LLM wargaming system that ran real national security simulations before being archived. What...

ai-agentslarge-language-modelsmilitary-strategy

#2170: Pricing Agentic AI When Nothing's Predictable

How do you charge fixed prices for systems that operate in fundamental uncertainty? Consultants are discovering frameworks that work—but they requi...

ai-agentsai-safetyprompt-engineering

#2169: How Enterprises Are Rethinking Agent Frameworks

Twelve major agentic AI frameworks exist—yet many serious developers avoid them entirely. What patterns emerge in real enterprise adoption?

ai-agentsai-safetysoftware-development

#2168: What Serious Agentic AI Developers Actually Need to Know

Python, TypeScript, LangGraph, and the frameworks reshaping how agents work. A technical map of the skills and concepts that separate prototypes fr...

ai-agentsai-orchestrationsoftware-development

#2167: Sync vs. Async: Architecting Agents for Scale

Why most enterprise AI agents fail in production has less to do with models and more to do with whether they're built synchronously or asynchronously.

ai-agentsmodel-context-protocoldistributed-systems

#2166: Code vs. Canvas: How Developers Pick Their Tools

LangGraph or Flowise? The honest answer isn't obvious. Developers gain speed and integrations with visual builders—but lose version control, testin...

ai-agentssoftware-developmentapi-integration

#2165: Strip Your Agent to Bash

The frameworks matter less than you think. What separates a working agent from a failing one is the harness—the orchestration, memory, and tool des...

ai-agentsai-orchestrationprompt-engineering

#2163: Designing Autonomy Boundaries for AI Agents

Production data reveals a surprising truth: fully autonomous AI agents waste 98% of their context window on tool descriptions. Here's why the indus...

ai-agentsai-orchestrationinference-parameters

#2158: Claude Managed Agents: Brain Versus Hands

Anthropic's new Managed Agents service runs your agent loop on their infrastructure. Here's what you gain, what you lose, and who it's actually for.

ai-agentsanthropicai-orchestration

#2146: The AI Wargame's Flat Hierarchy Problem

AI wargames treat NGOs and nuclear powers as equals. That's a dangerous flaw for real-world policy planning.

ai-agentsgeopolitical-strategymilitary-strategy

#2144: AI Wargaming: One Model or Many?

Should geopolitical AI simulations use one model or many? We debate the pros and cons of a single-model approach.

ai-agentsgeopoliticsmilitary-strategy