AI

Artificial intelligence, machine learning, and everything LLM

656 episodes

#2065: Why Run One AI When You Can Run Two?

Speculative decoding makes LLMs 2-3x faster with zero quality loss by using a small draft model to guess tokens that a large model verifies in para...

latency, gpu-acceleration, ai-inference

#2064: Why GPT-5 Is Stuck: The Data Wall Explained

The "bigger is better" era of AI is over. Here's why the industry hit a data wall and shifted to a new scaling law.

large-language-models, ai-training, data-storage

#2063: That $500M Chatbot Is Just a Base Model

That polite chatbot? It started as a raw, chaotic autocomplete engine costing half a billion dollars to build.

large-language-models, gpu-acceleration, ai-training

#2062: How Transformers Learn Word Order: From Sine Waves to RoPE

Transformers can’t see word order by default. Here’s how positional encoding fixes that—from sine waves to RoPE and massive context windows.

transformers, context-window, large-language-models

#2061: How Attention Variants Keep LLMs From Collapsing

Attention is the engine of modern AI, but it’s also a memory hog. Here’s how MQA, GQA, and MLA evolved to fix it.

transformers, ai-models, attention-mechanisms

#2060: The Tokenizer's Hidden Tax on Non-English Text

Why does a simple greeting in Mandarin cost more to process than in English? It's the tokenizer's hidden inefficiency.

linguistics, tokenization, ai-inference

#2059: npm Cache and Stale Dependencies in Agentic Pipelines

npx is silently running old versions of your AI tools. Here's why your updates vanish into a cache black hole.

ai-agents, cybersecurity, software-development

#2057: How Agents Break Through the LLM Output Ceiling

The output window is the new bottleneck: why massive context doesn't solve long-form generation.

ai-agents, context-window, rag

#2056: How Music Models Turn Sound Into Language

A look at how AI music models use audio tokens, transformers, and diffusion to turn text into songs.

audio-processing, transformers, generative-ai

#2050: Is Impact Investing Just a Cult?

We explore the structural parallels between high-control groups and the ESG industry, from loaded language to isolation tactics.

impact-investing, social-impact-bonds, financial-privacy

#2046: AI Hallucinations Are Just How Brains Work

We asked an AI to curate films about AI and reality, exploring the psychedelic overlap between machine hallucinations and human perception.

hallucinations, generative-ai, ai-ethics

#2045: Anonymity Isn't the Problem, The Architecture Is

Why does Reddit amplify toxicity while other anonymous spaces stay healthy? It's not the mask—it's the room's shape.

digital-privacy, social-engineering, human-computer-interaction

#2044: Teaching Physics with Sabotage and SimShield

Why the next generation of engineers must learn to "break" simulations and design for failure.

israel, military-strategy, open-source

#2043: Python, TypeScript, Rust: The Agent Engineer's Stack

Skip no-code traps. Learn the real stack for building agentic AI: Python, TypeScript, and Rust.

ai-agents, software-development, rust

#2041: The "MPEG Moment" for AI: Llamafile & Native Models

Why are we squeezing massive cloud models onto desktops? Meet the "native" AI revolution.

local-ai, quantization, hardware-engineering

#2040: The AI Inference Engine Rebellion

Why run LLMs locally? We break down Ollama, llama.cpp, vLLM, and llamafile—and when to use each.

local-ai, open-source, ai-inference

#2039: CLIs vs. MCPs: How AI Agents Actually Talk to Services

Why give an AI agent a terminal? We compare CLIs and MCPs for AI integration.

ai-agents, model-context-protocol, local-ai

#2038: The Self-Hosted AI Agent Buyer’s Guide

LobeHub vs. Dify vs. n8n: We break down the chaotic landscape of local AI agents to find the right "brain" for your workflow.

local-ai, ai-agents, smart-home

#2037: Claude Code Extensions: Slash Commands vs. Skills vs. Agents

Stop manually typing slash commands. Here’s the definitive hierarchy of Claude Code extensions—from legacy shortcuts to autonomous agents.

claude-code, ai-agents, prompt-engineering

#2029: ADHD Brains: Why Willpower Fails & How to Hack It

Stop blaming yourself for half-used planners. Here’s the neurobiology behind ADHD time management.

adhd, neuroscience, executive-function