#computer-vision

21 episodes

Jun 17

#3641: Archaeology’s Ray Gun Era: Drones, LiDAR & AI on Digs

Drones, ground-penetrating radar, and AI are transforming archaeology. The fine brush is just 5% of the story.

satellite-imagery, computer-vision, cultural-bias

May 20

#2939: Can a Security Camera Detect a Baby Not Moving?

Can AI tell when a baby is about to fall—or has stopped moving? We break down what's possible and what's not.

computer-vision, ai-ethics, child-development

May 14

#2825: The Patient Who Filmed His Own Bloating

How to set up cameras, markers, and time-lapse to capture abdominal distension for clinical or AI analysis.

computer-vision, digestive-health, post-cholecystectomy-syndrome

May 7

#2688: Intelligent Frame Extraction for Multimodal AI

Use multimodal AI and smart frame extraction to turn a walk-through video into an actionable decluttering plan.

multimodal-ai, computer-vision, prompt-engineering

May 6

#2668: When a Flamethrower Is Overkill

Tesseract, EasyOCR, or a cloud vision model? How to build a fast, reliable label scanner for real-world conditions.

computer-vision, edge-computing, latency

May 5

#2657: When Puppeteers Stopped Hiding

Background removal isn't magic — it's multiple AI systems working in sequence. Here's what's actually happening under the hood.

image-generation, computer-vision, generative-ai

Apr 30

#2546: The Invisible Engineering Behind a Single Click

The technical stack behind click-to-edit features in tools like Canva and Google Photos — from segmentation to inpainting.

image-generation, computer-vision, generative-ai

Apr 29

#2539: When Does AI Stop Hallucinating and Start Reconstructing?

What happens when you feed hundreds of photos into an AI world generator — do you capture reality or just a convincing dream?

urban-planning, cultural-bias, computer-vision

Apr 20

#2352: The Structured Output Gap in Vision APIs

How do object detection APIs like Gemini, AWS Rekognition, and YOLO compare for automated annotation workflows?

computer-vision, api-integration, benchmarks

Apr 19

#2325: Why Depth Is the Hardest Thing for AI to See

Can AI turn your apartment photos into a precise 3D model? Explore the tech behind photogrammetry and spatial reconstruction.

spatial-audio, computer-vision, digital-twins

Apr 7

#2089: Open-Source vs. Military ATR: The Drone Recognition Gap

A public GitHub model spotted by a listener reveals the massive gap between hobbyist AI and lethal military drone detection systems.

computer-vision, military-strategy, ai-agents

Apr 3

#1964: The Three Layers That Make AR Finally Work

See a 3D arrow pointing to the exact bolt you need, or read a street sign in real-time translation.

multimodal-ai, augmented-reality, computer-vision

Apr 3

#1963: RPA: Dead or Just Getting Smart?

Traditional RPA is brittle and blind. See how AI vision and agentic orchestration are turning it into a self-healing powerhouse.

ai-agents, legacy-systems, computer-vision

Apr 3

#1962: Moravec's Paradox: Why Robots Can Write Poetry but Can't Fold a Fitted Sheet

We explore the tech letting robots "reason" about physical tasks using vision-language-action models.

ai-agents, computer-vision, reasoning-models

Mar 31

#1855: When AI Makes Game Assets, Who Owns the Art?

From blocky polygons to photorealistic assets, AI is transforming how 3D models are made.

generative-ai, gaussian-splatting, computer-vision

Mar 31

#1817: The Hidden Taxonomy of AI: Why Specialized Models Outperform Giants

Explore the vast ecosystem of niche AI models for computer vision and document understanding, far beyond large language models.

computer-vision, rag, ai-models

Mar 31

#1799: The Original AI Blueprints: BERT & CLIP

Before GPT, two models changed everything. Discover how BERT and CLIP taught machines to read and see the world.

transformers, ai-history, computer-vision

Mar 25

#1541: Why Your Phone Beats Your PC at Video

Explore why mobile devices handle real-time video AI better than desktops and how the NPU gap is finally closing in 2026.

npu, edge-computing, computer-vision

Feb 22

#769: When Manuals Learn to See in 3D

Discover how AI and spatial computing are turning complex hardware repairs into real-time, interactive experiences.

multimodal-ai, computer-vision, hardware-engineering, industrial-automation, augmented-reality

Feb 22

#768: The Missing Nail: When Tiny Parts Stop Big Projects

From tiny laptop screws to industrial rivnuts, discover why the smallest components are often the biggest hurdles in any DIY project.

structural-engineering, computer-vision, hardware-standards

Dec 18

#64: How AI Learns to See, Hear, and Think Together

AI is evolving beyond text, learning to see, hear, and understand our world. Discover the future of human-AI interaction!

multimodal-ai, ai-senses, computer-vision, audio-processing, data-integration
