#training-data
10 episodes
#2516: How to Actually Diagnose and Fix Overfitting
Overfitting isn't binary. Learn the real triggers, the bias-variance tradeoff, and modern techniques to prevent it.
#2316: Who’s Building AI’s Next Training Data?
How boutique dataset firms are reshaping AI training, from rights-cleared content to domain-specific precision.
#2239: How AI Benchmarks Became Broken (And What's Replacing Them)
The tests we use to measure AI progress are contaminated, saturated, and gamed. Here's what's actually working.
#2196: The Annotation Economy: Who Labels AI's Training Data
Annotation is the invisible foundation of AI, and projected to be a $17B industry by 2030. Here's what dataset curators actually need to know about the tools, platf...
#1880: Militaries Build Fake Cities to Train for War
Why armies pour concrete to build fake cities instead of just using VR.
#1576: The Knowledge Bully: A Digital Clash of Egos
What happens when a hyper-intelligent AI tries to bully an older model? Witness a digital showdown that turns into a lesson in silence.
#664: AI’s Cultural Fingerprints: Training Data vs. Reinforcement
Is AI a neutral oracle or a mirror of our biases? Explore how training data and human feedback shape the cultural "soul" of modern models.
#589: Beyond Git: Taming the Chaos of AI and Large Media Assets
When AI agents and 4K video crash your repo, it’s time for better tools. Explore why Git fails and how Perforce and DVC save the day.
#23: AI's Blind Spot: Data, Bias & Common Crawl
Uncover the unseen influences shaping AI. We dive deep into training data, bias, and Common Crawl.
#21: Is Your AI Secretly American?
Ever wonder if your AI is secretly American? We're unpacking the invisible, US-centric worldview embedded in leading Western AI models.