#audio-processing
5 episodes
#64: AI's Senses: Seeing, Hearing, Understanding
AI is evolving beyond text, learning to see, hear, and understand our world. Discover the future of human-AI interaction!
#58: Clean Audio, Messy Reality: Noise Removal for Voice-to-Text
Fussy baby, clean audio? We dive into noise removal for voice-to-text. Discover why cleaner audio can produce worse transcripts.
#54: Tokenizing Everything: How Omnimodal AI Handles Any Input
Omnimodal AI: How do models process images, audio, video, and text all at once? Discover the engineering behind AI that accepts anything.
#33: The Unseen Magic of AI's Ears: Decoding VAD
Ever wonder how your AI knows you're talking? We're diving deep into VAD, the unseen magic behind AI's ears.
#8: Building Your Own Whisper
Ever wondered if you could build your own speech recognition tool? We dive deep into crafting custom ASR.