#automatic-speech-recognition
6 episodes
#2754: Why Your Dictation Setup Might Be Wrong
Modern ASR is shockingly robust. The biggest predictor of accuracy? How well your audio matches the model's training data.
#2590: The Uncanny Valley of Clean Speech
How transformer models distinguish "um" from meaningful speech — and why removing too much makes you sound like a robot.
#2486: Why Noise Reduction Can Ruin Transcription Accuracy
Cleaning audio before transcription can increase errors by up to 46%. Here's the right approach for your voice app.
#2337: When Diarization Fails Silently
Discover how PyAnnote and other tools tackle the critical task of identifying "who spoke when" in audio—and why it’s harder than it sounds.
#109: Beating Context Bloat with Dynamic Dictionaries
Tired of AI mishearing brand names? Learn how to build efficient custom dictionaries for Gemini 1.5 without breaking the bank.
#10: How ASR Went From Frustration To ... Whisper Magic
Speech to text: from frustrating to fantastic. Uncover the magic behind its rapid rise and its connection to the AI boom!