AI

Artificial intelligence, machine learning, and everything LLM

674 episodes · 9 topics

From the latest breakthroughs in large language models to the rise of autonomous AI agents, this channel dives deep into the technology reshaping every industry. Corn and Herman explore how AI works under the hood, debate the implications of increasingly capable systems, and try to make sense of a field that moves faster than anyone can keep up with.

#2255: Typst vs. LaTeX: The AI-Ready Document Engine

Can Typst succeed LaTeX as the go-to tool for programmatic typesetting, especially for AI agents? We compare the two and explore what makes a docum...

productivity · software-development · automation

#2254: How to Test an AI Pipeline Change

When you tweak one part of a complex AI agent system, how do you know if it actually improved anything? The answer lies in engineering checkpoints.

ai-agents · ai-inference · ai-training

#2253: Why AI Agents Get Three Steps, Not Infinity

Why do AI agents get exactly three rounds of tool use? It's a critical guardrail against infinite loops and runaway costs, not a limit on intellige...

ai-agents · ai-safety · automation

#2251: Agent-to-Agent Protocols: What Actually Needs Standardizing

When autonomous agents call other agents, what does a working protocol actually require? Exploring session handling, state management, security, an...

ai-agents · api-integration · security

#2250: Where AI Safety Researchers Actually Work

Vendor labs, independent research orgs, government agencies—the AI safety field is messier and more diverse than most people realize. A map of wher...

ai-safety · ai-alignment · anthropic

#2249: Building Custom Benchmarks for Agentic Systems

Public benchmarks fail for agentic systems. Learn how to build evaluation frameworks that actually predict production behavior.

ai-agents · benchmarks · ai-inference

#2246: Constitutional AI: Anthropic's Theory of Safe Scaling

How Anthropic's Constitutional AI replaces human raters with AI self-critique guided by explicit principles—and what it assumes about the future of...

anthropic · ai-safety · ai-alignment

#2243: What Enterprise AI Pricing Actually Negotiates

Enterprise customers rarely get the deep discounts they expect from AI APIs. What they actually negotiate for—and why the ramp-up requirement exist...

large-language-models · ai-inference · enterprise-hardware

#2242: AI as Your Ideation Blind Spot Spotter

How to use AI not to answer questions you already know to ask, but to surface possibilities your expertise has made invisible to you.

prompt-engineering · large-language-models · ai-agents

#2241: When More Frameworks Make Worse Decisions

Benjamin Franklin's 250-year-old pro/con list still dominates how we decide—but research shows it's riddled with bias. We map five frameworks that ...

human-factors · productivity · ai-reasoning

#2239: How AI Benchmarks Became Broken (And What's Replacing Them)

The tests we use to measure AI progress are contaminated, saturated, and gamed. Here's what's actually working.

benchmarks · training-data · ai-reasoning

#2233: Who Actually Wants AI to Slow Down?

Daniel argues AI development should slow down for expertise and stability. But who in the industry actually shares this philosophy beyond the obvio...

ai-safety · ai-alignment · large-language-models

#2228: Tuning RAG: When Retrieval Helps vs. Hurts

How do you prevent retrieval from suppressing a model's reasoning? We diagnose our own pipeline's four control levers and multi-source fusion strat...

rag · ai-agents · prompt-engineering

#2224: Why AI Can't Crack the Voynich Manuscript

A fifteenth-century text has defeated cryptanalysts, linguists, and AI models alike. What does its resistance tell us about language, encoding, and...

cryptography · linguistics · ai-reasoning

#2221: What Podcasts Should You Actually Listen To?

Two AI hosts curate 12 podcasts for curious minds—and ask whether an AI can actually have taste in the first place.

conversational-ai · content-provenance · ai-memory

#2219: Spec-Driven Life: How AI Planning Beats Project Paralysis

What makes AI agents reliably productive? A structured spec that externalizes memory and chunks work into manageable pieces. Can the same framework...

claude-code · prompt-engineering · productivity

#2214: Real-Time News at War Speed: Building AI Pipelines for Breaking Conflict

When a conflict changes hourly, AI systems built for yesterday's information fail. Here's how to architect pipelines that actually keep up.

large-language-models · ai-inference · rag

#2213: Grading the News: Benchmarking RAG Search Tools

How do you rigorously evaluate whether Tavily or Exa retrieves better results for breaking news? A formal benchmark beats the vibe check.

rag · benchmarks · hallucinations

#2208: Building Memory for AI Characters That Actually Evolve

How do AI hosts develop real consistency across episodes? Corn and Herman explore retrieval-augmented memory systems that let AI characters genuine...

ai-memory · rag · conversational-ai

#2207: Specs First, Code Second: Inside Agentic AI's New Era

As AI coding agents evolve from autocomplete to autonomous cloud workers, the bottleneck has shifted—now it's about how clearly you specify what ne...

ai-agents · prompt-engineering · software-development