#interpretability

6 episodes

Apr 25

#2405: LLM Benchmarks Are Full of Noise: Statistical Rigor in AI Evals

Why most benchmark claims in AI are statistically indefensible — and what to do about it.

benchmarks interpretability llm-as-a-judge

Apr 12

#2188: Is Emergence Real or Just Bad Metrics?

The debate over whether AI models exhibit genuine emergent abilities or just appear to because of how we measure them—and why it matters for safety...

emergent-abilities ai-training interpretability

Mar 26

#1561: Abliteration: The High-Dimensional Lobotomy of AI

Discover how researchers are surgically removing refusal filters from AI models using a mathematical process called abliteration.

ai-safety interpretability open-source-ai

Mar 17

#1328: Silicon Sigils: Why We Treat AI Like an Occult Force

Is AI a tool or a digital demon? Explore why technical illiteracy is turning neural networks into a modern-day moral panic.

human-computer-interaction ai-safety interpretability

Mar 6

#1001: Why Your 1990s Credit Card Was Smarter Than ChatGPT

Think AI started with ChatGPT? Discover the "long haulers" in defense, medicine, and finance who have used machine learning for decades.

ai-history machine-learning-history defense-technology financial-fraud interpretability

Mar 6

#974: Inside the Black Box: The Mystery of Emergent AI Logic

We build digital cathedrals but lack the blueprints. Explore the "black box" of AI, emergent abilities, and the mystery of double descent.

large-language-models ai-reasoning latent-space interpretability emergent-abilities