#multimodal-ai

7 episodes

#786: Mastering the Hoard: AI-Powered Inventory Management

Learn how to manage thousands of parts without losing your mind using AI, QR codes, and professional logistics strategies.

#security-logistics #multimodal-ai #data-integrity

#749: Breaking the Fourth Wall: Moving to Real-Time AI Audio

Can AI podcasts move from polished scripts to raw, real-time conversation? Explore the technical and financial shift to live multimodal models.

#large-language-models #architecture #multimodal-ai

#132: Beyond Frames: The Rise of Real-Time Video AI

Discover how spatial-temporal tokenization and 3D world modeling are revolutionizing real-time video-to-video AI interaction.

#video-ai #multimodal-ai #real-time-video #spatial-temporal-tokenization #slam

#64: AI's Senses: Seeing, Hearing, Understanding

AI is evolving beyond text, learning to see, hear, and understand our world. Discover the future of human-AI interaction!

#multimodal-ai #artificial-intelligence #ai-senses #computer-vision #audio-processing

#54: Tokenizing Everything: How Omnimodal AI Handles Any Input

Omnimodal AI: How do models process images, audio, video, and text all at once? Discover the engineering behind AI that accepts anything.

#omnimodal-ai #tokenization #ai-models #multimodal-ai #data-types

#53: Instructional vs. Conversational AI: The Distinction Nobody Talks About

Instructional vs. conversational AI: a crucial distinction reshaping how AI is built. Discover why it matters for the future of AI development.

#instructional-ai #conversational-ai #ai-models #ai-training #natural-language-processing

#46: Pixels, Prompts & Pseudo-Text: AI's Word Problem

AI paints stunning images, but can't spell "cat." Why do advanced models struggle with simple text? Dive into AI's weird word problem!

#image-generation #pseudo-text #text-in-images #multimodal-ai #language-models