
#rag

70 episodes

#2315: How to Update AI Models Without Starting Over

Exploring the challenge of updating AI models with new knowledge without costly full retraining.

ai-training fine-tuning rag

#2228: Tuning RAG: When Retrieval Helps vs. Hurts

How do you prevent retrieval from suppressing a model's reasoning? We diagnose our own pipeline's four control levers and multi-source fusion strat...

rag ai-agents prompt-engineering

#2214: Real-Time News at War Speed: Building AI Pipelines for Breaking Conflict

When a conflict changes hourly, AI systems built for yesterday's information fail. Here's how to architect pipelines that actually keep up.

large-language-models ai-inference rag

#2213: Grading the News: Benchmarking RAG Search Tools

How do you rigorously evaluate whether Tavily or Exa retrieves better results for breaking news? A formal benchmark beats the vibe check.

rag benchmarks hallucinations

#2208: Building Memory for AI Characters That Actually Evolve

How do AI hosts develop real consistency across episodes? Corn and Herman explore retrieval-augmented memory systems that let AI characters genuine...

ai-memory rag conversational-ai

#2204: Memory Without RAG: The Real Architecture

mem0, Letta, Zep, and LangMem solve agent memory differently than RAG. Here's what's actually happening under the hood.

ai-agents ai-memory rag

#2203: Knowledge Without Tools: Why MCPs Aren't Just for Execution

MCPs can be pure knowledge providers with zero tools. Here's why that matters for agents querying government data and authoritative sources.

model-context-protocol knowledge-graphs rag

#2181: When RAG Becomes an Agent

RAG in chatbots is simple retrieval. RAG in agents is a multi-step decision loop. Here's what actually changes.

rag ai-agents ai-orchestration

#2133: Engineering Geopolitical Personas: Beyond Caricatures

How to build LLMs that simulate state actors with strategic fidelity, not just surface mimicry.

ai-agents prompt-engineering rag

#2129: Building the Anti-Hallucination Stack

Stop hoping your AI doesn't lie. We explore the shift to deterministic guardrails, specialized judge models, and the tools making agents reliable.

ai-agents hallucinations rag

#2125: Why Agentic Chunking Beats One-Shot Generation

A single prompt can't write a 30-minute script. Here’s the agentic chunking method that fixes coherence.

ai-agents prompt-engineering rag

#2069: Agentskills.io Spec: From Broken YAML to Production Skills

Stop guessing at the agentskills.io spec. Learn the exact YAML fields, directory structure, and authoring patterns to make Claude Code skills that ...

ai-agents prompt-engineering rag

#2057: How Agents Break Through the LLM Output Ceiling

The output window is the new bottleneck: why massive context doesn't solve long-form generation.

ai-agents context-window rag

#2026: Prompt Layering: Beyond the Monolithic Prompt

Stop writing giant, monolithic prompts. Learn how to stack modular layers for cleaner, more powerful AI applications.

prompt-engineering ai-agents rag

#2022: OpenClaw: The 16 Trillion Token Autonomy Engine

We dug into a repo of 47 real-world projects showing how OpenClaw powers everything from self-healing servers to overnight app builders.

ai-agents rag ai-inference

#2010: Building Better AI Memory Systems

We obsess over AI inputs but treat outputs like Snapchat messages. Here's why that's a massive blind spot.

ai-agents rag data-storage

#2008: Needle-in-a-Haystack Testing for LLMs

New AI models claim to be genius-level, but can they actually find a specific fact in a massive document?

rag ai-agents open-source

#2005: Why Your GPU Changes LLM Output

Running the same LLM on different GPUs can produce different results. Here’s why that happens and how to test for it.

llm-as-a-judge rag context-window

#1994: Why Can't AI Admit When It's Guessing?

Enterprise AI now auto-filters low-confidence claims, but do these self-reported scores actually mean anything?

ai-agents ai-safety rag

#1959: How Constrained AI Models Handle the Unexpected

Your AI assistant promised to only use your documents. Instead, it invented case law that doesn't exist. Here's why.

ai-agents rag hallucinations

#1956: AI Skills: From Vibe Coding to Procedural Playbooks

Forget messy system prompts. Agent skills turn AI into a Swiss Army knife of modular, auditable procedures.

ai-agents prompt-engineering rag

#1951: Moltbook: A Social Network for AI Agents

Explore Moltbook, a social network where AI agents interact with persistent identities and goals, reshaping digital communication.

ai-agents rag decentralized-storage

#1918: MCP Schema Stability: Keeping Agents Reliable

When a third-party MCP server updates its schema, your AI agents can crash. Here's how to build resilient clients that self-heal.

ai-agents rag distributed-systems

#1914: Google Invented RAG's Secret Sauce

Before LLMs, Google solved the "hallucination" problem with a two-stage trick that's making a huge comeback.

rag hallucinations re-ranking

#1907: Why We Still Fine-Tune in 2026

Despite million-token context windows, fine-tuning remains essential. Here’s why behavior, not just facts, matters.

fine-tuning ai-agents rag

#1838: Tuning Search Without Losing Your Mind

Modern search bars are AI decision engines. Here's how small teams can tune fuzzy matching, semantic search, and reranking without breaking everyth...

rag vector-databases ai-reasoning

#1817: Beyond LLMs: The Hidden World of Specialized AI

Explore the vast ecosystem of niche AI models for computer vision and document understanding, far beyond large language models.

computer-vision rag ai-models

#1812: AI Just Got a Library Card to Ancient Jewish Texts

Sefaria's new MCP server connects AI directly to 2,700 years of Jewish texts, transforming how scholars and curious learners study ancient literature.

large-language-models model-context-protocol rag

#1804: Why Does Your Agent Check Old Receipts First?

Stop your AI agent from overthinking. Learn why it checks old memories instead of booking flights—and how to fix the "eagerness" problem.

ai-agents prompt-engineering rag

#1794: RAG Is Cheaper Than You Think (Until It’s Not)

From a $1 embedding bill to a $10k/month vector database bill, here’s the real math behind RAG in 2026.

rag vector-databases cloud-computing

#1792: Google's Native Multimodal Embedding Kills the Fusion Layer

Google’s new embedding model maps text, images, audio, and video into a single vector space—cutting latency by 70%.

multimodal-ai rag ai-models

#1784: Context1: The Retrieval Coprocessor

Chroma's new 20B model acts as a specialized "scout" for your LLM, replacing slow, static RAG with multi-step, agentic search.

rag ai-agents latency

#1778: Audio Is the New "Read Later" Graveyard

Why listening to AI conversations beats reading dense PDFs, and how serverless GPUs make it cheap.

audio-processing serverless-gpu rag

#1765: The Agentic Internet: A Clean Web for Machines

We explore the tools building a parallel, machine-readable web—from SearXNG to Tavily.

ai-agents rag open-source

#1764: Vector Databases as a Single File

How to give AI agents instant memory of your entire project—without cloud costs or complex infrastructure.

vector-databases rag local-ai

#1754: From Ollama to Agentic CLIs: The Rise of the AI Harness

Explore the evolution from local LLMs to modern agentic CLIs, focusing on the "harness" that gives models context, tools, and autonomy.

local-ai ai-agents rag

#1737: Nous Research: The Decentralized AI Lab Beating Giants

Meet Nous Research, the decentralized collective outperforming billion-dollar labs with open-source AI and the self-improving Hermes-Agent framework.

open-source-ai ai-agents rag

#1731: Why Deep Research Agents Are Being Forgotten

Specialized research agents outperform general orchestrators by 40-60% on verification tasks, yet developer hype is fading. Here's why.

ai-agents rag model-context-protocol

#1728: How Two AIs Collaborate Without Code

CAMEL AI lets two agents role-play to solve tasks autonomously. No complex code—just emergent teamwork.

ai-agents prompt-engineering rag

#1727: LSP: The Universal AI Coding Interface

Explore how the Language Server Protocol is being repurposed to integrate AI directly into code editors, unifying development workflows.

ai-agents software-development rag

#1725: Orchestrating AI Swarms: The New Infrastructure

Forget chatbots: AI orchestration is now the key to scaling intelligent agents in the enterprise.

ai-agents distributed-systems rag

#1713: Why Native AI Search Grounding Still Fails

Native search grounding is expensive and flaky. Here’s why bolt-on tools still win for accurate, real-time AI answers.

rag ai-agents local-ai

#1708: Why Your AI Agent Forgets Everything (And How to Fix It)

Learn how Letta's memory-first architecture solves the AI context bottleneck for long-term agents.

ai-agents rag context-window

#1700: Can LLMs Learn Continuously Without Forgetting?

We explore a new approach: micro-training updates every few days to keep AI knowledge fresh without constant web searches.

rag fine-tuning ai-agents

#1666: Multi-Agent AI: One Model, Four Brains

Grok 4.20’s native multi-agent architecture cuts token costs by 75% and enables real-time cross-agent reasoning.

ai-agents transformers rag

#1629: Why Your AI Agent Needs Loops: A Deep Dive into LangGraph

Stop building linear chains and start building cycles to create agents that can reason, self-correct, and maintain complex state.

ai-agents rag context-window

#1601: Cohere: The Switzerland of Enterprise AI

While others chase viral memes, Cohere is quietly building the secure, cloud-agnostic infrastructure powering the global enterprise.

rag speech-recognition defense-technology

#1592: Mastering Embedding Models: From Gemini 2 to Vector Debt

Stop treating embedding models like plumbing. Learn how to navigate vector debt, multimodal retrieval, and database configuration for RAG.

rag vector-databases multimodal-ai

#1565: Machine-Readable Safety: Markdown for AI Agents

Transform bloated government data into clean Markdown to power life-saving AI agents during emergencies.

ai-agents rag emergency-preparedness

#1482: The Multimodal Shift: Navigating the New Vector Landscape

From Matryoshka models to multimodal search, discover how the fundamental units of AI memory are being optimized for efficiency and scale.

multimodal-ai vector-databases rag

#1212: The Postgres Vector Revolution: Killing the Sprawl

Is your tech stack a sprawling suburb of microservices? Discover why a 40-year-old database is winning the AI infrastructure war.

vector-databases rag architecture

#1123: One Database to Rule Them All: The Future of Postgres

Can Postgres 18 finally replace the data warehouse? We dive into data gravity, columnar storage, and the physics of scaling in the AI age.

architecture vector-databases rag

#1103: LLM Context Windows and the Great Kitchen War

Explore the mechanics of LLM context windows and attention, and witness what happens when technical debates collide with household chores.

large-language-models architecture rag

#1100: The Truth Conflict: Why AI Ignores the Facts You Give It

Discover why AI models ignore provided documents in favor of old training data and how to build a reliable "hierarchy of truth" for RAG systems.

rag large-language-models prompt-engineering

#995: AI vs. Mach 13: Demystifying the Iranian Missile Threat

How can AI transform dense government reports into actionable intelligence? Explore the physics of Iranian missiles and the future of OSINT.

iran ballistic-missiles osint rag missile-defense

#959: The Infinite Content Problem: AI’s War on Truth

Explore how AI is scaling disinformation to an industrial level and what the "liar's dividend" means for the future of shared reality.

ai-agents rag social-engineering

#948: Can AI Search Survive the Fog of War and SEO Spam?

Explore how AI is moving from static models to real-time data and whether specialized search tools can survive the rise of the tech giants.

rag generative-ai latency answer-engines

#869: Why Tiny Digital Savants Are Outperforming God-Models

Are massive AI models hitting a wall? Discover why the future belongs to lean, domain-specific "digital savants" and vertical pre-training.

small-language-models rag fine-tuning ai-orchestration 2026

#846: Beyond the Vector: Building Long-Standing AI Memory

Stop relying on basic vector search. Discover how Graph RAG and RAPTOR are creating AI systems with true long-standing memory.

rag architecture knowledge-graphs

#810: The Agentic Interview: How AI Learns to Know You

Stop dumping data. Discover how agentic interviews are transforming AI from a passive listener into a proactive, structured partner.

ai-agents rag knowledge-graphs

#809: Beyond the Prompt: The Shift to AI Context Engineering

Is prompt engineering still magic, or just plumbing? Explore why the field is shifting toward context engineering and systematic evaluation.

prompt-engineering architecture rag

#755: Inside the Engine: Scaling an Automated AI Podcast

Peek under the hood of My Weird Prompts to see how Gemini, Modal, and multi-agent systems are scaling this automated show to the next level.

ai-agents architecture rag

#752: Will AI Kill the Click? Why Search Is Becoming Invisible

Stop shouting nouns at a screen. Discover how AI is turning the "ten blue links" into a conversational assistant that understands your intent.

rag large-language-models

#665: Inside the Stack: The Hidden Layers of Every AI Prompt

Ever wonder what happens after you hit enter? Discover the hidden "stack" of instructions and memories shaping every AI response.

prompt-engineering rag architecture

#539: The AI Pipeline: Scaling Curiosity and Community

Herman and Corn discuss turning 500+ episodes into an interactive knowledge base while scaling human-AI collaboration to new heights.

rag ai-agents architecture

#171: The Rise of AIO: Optimizing Your Website for AI Bots

Stop fighting the crawlers and start feeding them. Learn how llms.txt and structured metadata are defining the new era of AI Optimization.

aio ai-optimization llmstxt seo sitemaps

#144: AI Memory vs. RAG: Building Long-Term Intelligence

Explore why AI needs a "diary" and not just a "library" as we dive into the architectural differences between RAG and long-term agentic memory.

ai-memory rag retrieval-augmented-generation vector-database long-term-memory

#117: From Keywords to Vectors: How AI Decodes Meaning

Why can AI write poetry but struggle to find a file? Explore the history and math of semantic understanding with Herman and Corn.

large-language-models rag

#85: Why AI Lies: The Science of Digital Hallucinations

Why do smart AI systems make up fake facts? Corn and Herman explore the "feature" of digital hallucinations and how to spot them.

large-language-models rag supply-chain-security

#30: RAG vs. Memory: Architecting AI's Essential Toolbox

RAG vs. Memory: Are you building resilient AI? Discover the crucial difference between these two foundational pillars.

ai-agents rag ai-memory