
#computer-vision

18 episodes

#2688: Declutter Your Apartment with AI Video Analysis

Use multimodal AI and smart frame extraction to turn a walk-through video into an actionable decluttering plan.

multimodal-ai, computer-vision, prompt-engineering

#2668: OCR vs VLMs: Reading Labels on Camera

Tesseract, EasyOCR, or a cloud vision model? How to build a fast, reliable label scanner for real-world conditions.

computer-vision, edge-computing, latency

#2657: How Background Removal Actually Works (and Why It Matters for AI Art)

Background removal isn't magic — it's multiple AI systems working in sequence. Here's what's actually happening under the hood.

image-generation, computer-vision, generative-ai

#2546: How AI Editing Tools Actually Delete and Move Objects

The technical stack behind click-to-edit features in tools like Canva and Google Photos — from segmentation to inpainting.

image-generation, computer-vision, generative-ai

#2539: Can 400 Photos Rebuild a City or Just Its Vibe?

What happens when you feed hundreds of photos into an AI world generator — do you capture reality or just a convincing dream?

urban-planning, cultural-bias, computer-vision

#2352: Object Detection APIs: Choosing the Right Tool for Your Workflow

How do object detection APIs like Gemini, AWS Rekognition, and YOLO compare for automated annotation workflows?

computer-vision, api-integration, benchmarks

#2325: How AI Turns Photos Into 3D Models for Your Apartment

Can AI turn your apartment photos into a precise 3D model? Explore the tech behind photogrammetry and spatial reconstruction.

spatial-audio, computer-vision, digital-twins

#2089: Why AI Drones Need Millions of Images

A public GitHub model spotted by a listener reveals the massive gap between hobbyist AI and lethal military drone detection systems.

computer-vision, military-strategy, ai-agents

#1964: AI Glasses That See Through Your Eyes

See a 3D arrow pointing to the exact bolt you need, or read a street sign in real-time translation.

multimodal-ai, augmented-reality, computer-vision

#1963: RPA: Dead or Just Getting Smart?

Traditional RPA is brittle and blind. See how AI vision and agentic orchestration are turning it into a self-healing powerhouse.

ai-agents, legacy-systems, computer-vision

#1962: Why Robots Think Before They Grab

We explore the tech letting robots "reason" about physical tasks using vision-language-action models.

ai-agents, computer-vision, reasoning-models

#1855: AI Is Turning Your Photos Into 3D Models

From blocky polygons to photorealistic assets, AI is transforming how 3D models are made.

generative-ai, gaussian-splatting, computer-vision

#1817: Beyond LLMs: The Hidden World of Specialized AI

Explore the vast ecosystem of niche AI models for computer vision and document understanding, far beyond large language models.

computer-vision, rag, ai-models

#1799: The Original AI Blueprints: BERT & CLIP

Before GPT, two models changed everything. Discover how BERT and CLIP taught machines to read and see the world.

transformers, ai-history, computer-vision

#1541: The NPU Revolution: Why Your Phone Outperforms Your PC

Explore why mobile devices handle real-time video AI better than desktops and how the NPU gap is finally closing in 2026.

npu, edge-computing, computer-vision

#769: The Living Manual: AI and AR for High-Tech Repairs

Discover how AI and spatial computing are turning complex hardware repairs into real-time, interactive experiences.

multimodal-ai, computer-vision, hardware-engineering, industrial-automation, augmented-reality

#768: Small Parts, Big Problems: The Engineering of Fasteners

From tiny laptop screws to industrial rivnuts, discover why the smallest components are often the biggest hurdles in any DIY project.

structural-engineering, computer-vision, hardware-standards

#64: AI's Senses: Seeing, Hearing, Understanding

AI is evolving beyond text, learning to see, hear, and understand our world. Discover the future of human-AI interaction!

multimodal-ai, ai-senses, computer-vision, audio-processing, data-integration