#2911: Building a $180 Privacy-First AI Wearable

How Omi's $99 dev kit lets you build a local-first voice productivity system that watches your screen.

Episode Details
Episode ID: MWP-3080
Duration: 30:55
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: deepseek-v4-pro

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

Omi began as a wearable recorder in 2023, a pendant you'd clip on to capture meetings. But by late 2024, the company realized the hardware was commoditized and pivoted hard to becoming a developer platform. The dev kit shipped in Q1 2025, and the screen processing beta landed in March 2026 — three distinct eras in three years. For $99, you get an ESP32-S3 board with a MEMS microphone array, Bluetooth LE, and a six-axis IMU running FreeRTOS. The key differentiator is the fully exposed I2S audio bus, letting you swap in your own voice activity detection and ASR models.

The screen processing beta takes things further. The companion app grabs a screenshot every two seconds, runs Tesseract OCR compiled for ARM, feeds the text to a local Phi-3-mini-4k-instruct model through llama.cpp, and extracts tasks, deadlines, and follow-ups into a local SQLite database. Everything stays on-device or on your local server — no cloud round trips. The tradeoff is accuracy: cloud OCR hits ~99%, while on-device Tesseract drops to ~92% on complex UIs. For many use cases, that's plenty, especially when the alternative is sending screenshots of your email, Slack, and code editor to someone else's server every two seconds.

The ecosystem is still small (12 published projects on the Omi Hub, only three with more than 100 downloads), but the projects that exist are genuinely impressive. One developer built a system that watches her IDE terminal for build errors and auto-creates Jira tickets. A design agency built a workflow that listens to client calls, watches the designer's monitor for Figma feedback comments, and auto-generates revision notes in Notion. With total hardware costs under $200 ($99 dev kit + $80 Raspberry Pi 5), Omi offers a compelling path for builders who want a voice-controlled screen observer that owns its own data.


Corn
Daniel sent us this one — he's been watching what Omi's been up to, from those early wearable recorders to the dev kit that basically lets you build your own open-source Plaud, and now they've got this screen processing beta that watches what you're doing and generates task reminders. He's asking how big the ecosystem around this thing has actually gotten, and what it takes to go from ordering the dev kit to having a working voice productivity system. And honestly, the timing is perfect because they just shipped version zero point three of that screen processing beta, and the community crossed two thousand active builders on GitHub.
Herman
Two thousand three hundred plus stars on the repo, actually. Forty seven community forks. This thing is moving fast.
Corn
Of course you have the exact number.
Herman
I looked it up this morning. But here's what makes this moment interesting — Omi started as a wearable recorder in twenty twenty three, the kind of thing you'd clip on and use to capture meetings. Then late twenty twenty four they realized the hardware was commoditized and pivoted hard to being a developer platform. The dev kit shipped in Q one of twenty twenty five, and now the screen processing beta landed in March. Three distinct eras in three years.
Corn
The question is whether this is actually a platform play that's found its footing, or just a developer toy with a good press kit. And more practically — if someone listening wants to build their own voice productivity system with a ninety nine dollar dev kit, what does that actually look like?
Herman
That's the thing. The dev kit is an ESP thirty two S three board with a MEMS microphone array, Bluetooth LE, and a six axis IMU. It runs FreeRTOS, ships with reference firmware that does real time keyword spotting and audio streaming. For ninety nine dollars, you get full access to the I two S audio bus — you can swap in your own voice activity detection model, your own automatic speech recognition pipeline. This is not a closed appliance. This is a hardware toolkit.
Corn
It's the anti-Plaud. Plaud gives you a polished transcription experience but you never touch the raw audio stream or the model pipeline. Omi hands you the keys and says build whatever you want.
Herman
And the screen processing thing takes it somewhere completely different from the original recorder vision. The companion app grabs a screenshot every two seconds, runs OCR through Tesseract compiled for ARM, feeds the text to a local LLM — they're using Phi three mini four k instruct running through llama dot cpp — with a prompt that's basically extract any tasks, deadlines, or follow ups from this text. Results go into a local SQLite database and surface as notifications. No cloud round trip. Everything stays on device or on your local server.
Corn
You're trading some accuracy for total privacy. Cloud OCR hits around ninety nine percent accuracy, on device Tesseract on complex UIs drops to maybe ninety two percent. That's the bargain.
Herman
For a lot of use cases, ninety two percent is plenty. Especially when the alternative is sending screenshots of everything on your monitor to someone else's server every two seconds. I mean, think about what that actually means. Your email, your Slack, your code editor, your bank account if you tab over to check something — all of it getting OCR'd and sent to the cloud. That's a nonstarter for anyone with actual security concerns.
Corn
The digital equivalent of a surveillance camera in your own living room.
Herman
And the local approach means you need some compute. Their reference server spec is a Raspberry Pi five with eight gigs of RAM, which runs about eighty dollars. So your total hardware cost is the ninety nine dollar dev kit plus an eighty dollar Pi — under two hundred dollars for a voice-controlled screen observer that owns its own data.
Corn
That's the pitch anyway. The real question is what people are actually building with it, and whether the ecosystem is deep enough to sustain itself. We should dig into that.
Herman
Yeah, the Omi Hub has twelve published projects right now. Only three have more than a hundred downloads. But one of them is genuinely impressive — a developer named Sarah Chen built a system that watches her IDE terminal for build errors and auto creates Jira tickets. She published the whole thing in April.
Corn
That's the kind of thing that makes you realize we're not talking about a toy. Someone is literally using a ninety nine dollar pendant to watch their code compile and file tickets automatically.
Herman
There's a design agency that published a full build guide. Their setup listens to client calls, watches the designer's monitor for feedback comments in Figma, and auto generates revision notes in Notion. That's a real production workflow, not a weekend hack.
Corn
The ecosystem is small but the projects that exist are punching above their weight. Which brings us back to the core question — if you want to build this, where do you actually start?
Herman
The starting point depends on what kind of builder you are. But before we map the path, I think it's worth understanding why Omi even exists as a dev kit in the first place. Because the pivot they made in late twenty twenty four wasn't obvious.
Corn
A company realizing its hardware is a commodity and deciding to become a platform instead — that's the kind of pivot that usually happens after a failed product launch, not before one.
Herman
That's what makes it interesting. They launched in twenty twenty three as a pendant recorder. Clip it on, capture meetings, get transcripts. Perfectly fine product. But they looked at the landscape and saw that everyone was building the same thing — Plaud, Humane, a dozen other wearable recorders — and the differentiating factor wasn't the hardware. It was the software stack and who controlled the data flow.
Corn
Instead of trying to win the consumer gadget war, they turned their product into a reference design and said here, you build the thing you actually want.
Herman
The dev kit shipped Q one twenty twenty five with that ESP thirty two S three, the MEMS mic array, the IMU, Bluetooth LE — all the specs we mentioned. But the key decision was exposing the full I two S audio bus. That's not something you do if you're trying to protect a walled garden. That's a platform move.
Corn
It's like the difference between selling a smart speaker and selling an Arduino with a really good microphone. One of those has a future as an ecosystem.
Herman
Then the screen processing beta in March of this year — that's where the vision expands beyond audio entirely. The original recorder was about capturing what people say. Screen processing is about capturing what you see. Two completely different input modalities, same underlying philosophy: observe, extract, remind.
Corn
Omi isn't really a product company anymore. They're a hardware platform with a reference implementation, and the actual products are whatever the community builds on top.
Herman
That's the bet. And it's a bet that only works if the developer experience is good enough that people actually build things. The question is whether two thousand GitHub stars and twelve published projects add up to traction or just curiosity.
Corn
Twelve projects, three with meaningful downloads — that's a candle, not a fire. But the Sarah Chen build error to Jira pipeline, the design agency Figma to Notion workflow — those aren't toy projects. Those are people solving real production problems with a ninety nine dollar pendant.
Herman
The screen processing is where this gets different from the original recorder vision. A recorder captures audio and gives you a transcript. Useful, but passive. Screen processing watches your actual work surface and extracts intent — deadlines you glanced at in an email, action items someone typed in Slack, a date you hovered over in a calendar. It's not recording what you hear. It's inferring what you need to do.
Corn
Which is either the killer feature for a second brain or the creepiest thing a pendant has ever done, depending on how you feel about a local LLM reading your screen every two seconds.
Herman
I think the creepiness factor depends entirely on where that data lives. If everything's local, it's weird but it's your weird. Nobody else's server knows you hovered over a dentist appointment for four seconds.
Corn
Like having a diary that reads over your shoulder, versus a diary that phones home to a marketing firm.
Herman
So let's talk about what's actually inside this thing technically, because the architecture is what makes the local-first approach possible. The dev kit uses the ESP thirty two S three running FreeRTOS. Audio comes in through the MEMS mic array, hits a custom pipeline built on the ESP dash SR library for wake word detection — default wake word is Hey Omi — and then once the wake word triggers, it streams sixteen bit PCM audio at sixteen kilohertz over Bluetooth LE to either the companion app or your local server.
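To make the audio numbers concrete, here is a minimal sketch of what a sixteen bit, sixteen kilohertz stream costs in raw bandwidth and how samples pack into PCM bytes. The mono channel, the 20 ms frame length, the little-endian byte order, and every function name are assumptions for illustration; none of them come from the Omi firmware.

```python
import struct

SAMPLE_RATE_HZ = 16_000   # from the episode: sixteen kilohertz
BITS_PER_SAMPLE = 16      # from the episode: sixteen bit PCM
CHANNELS = 1              # assumption: a single mono channel
FRAME_MS = 20             # assumption: 20 ms frames, a common size for speech


def bytes_per_second() -> int:
    """Raw, uncompressed data rate of the stream."""
    return SAMPLE_RATE_HZ * (BITS_PER_SAMPLE // 8) * CHANNELS


def samples_per_frame() -> int:
    """Number of samples in one frame of FRAME_MS milliseconds."""
    return SAMPLE_RATE_HZ * FRAME_MS // 1000


def pack_frame(samples: list[int]) -> bytes:
    """Pack signed 16-bit samples as little-endian PCM bytes."""
    return struct.pack(f"<{len(samples)}h", *samples)


def unpack_frame(payload: bytes) -> list[int]:
    """Inverse of pack_frame: recover the signed 16-bit samples."""
    count = len(payload) // 2
    return list(struct.unpack(f"<{count}h", payload))


if __name__ == "__main__":
    print(bytes_per_second(), "bytes per second before any compression")
    print(samples_per_frame(), "samples per frame")
```

Under these assumptions the stream is 32,000 bytes per second, which is the figure any downstream buffer or transport has to keep up with.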
Corn
The wake word detection happens on the pendant itself. The heavy lifting happens downstream.
Herman
And this is where the I two S audio bus matters. On a closed device like the Plaud NotePin, the raw audio stream never leaves the proprietary pipeline. You get whatever the manufacturer decided to give you. On the Omi dev kit, the I two S bus is fully exposed — you can tap into the raw digital audio before it hits any processing stage. That means you can swap in your own voice activity detection model, your own automatic speech recognition engine. If you want to use Whisper instead of their default, you just point the pipeline at your Whisper instance.
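The "swap in your own engine" idea can be sketched as a small interface that the rest of the pipeline depends on. This is a hypothetical design, not Omi's actual plugin API: the `Transcriber` protocol, `VoicePipeline`, and both stand-in engines are invented names, and a real Whisper adapter would sit behind the same `transcribe` method.

```python
from typing import Protocol


class Transcriber(Protocol):
    """Anything that turns raw PCM bytes into text (hypothetical interface)."""

    def transcribe(self, pcm: bytes, sample_rate_hz: int) -> str: ...


class DurationTranscriber:
    """Stand-in engine: reports how much 16-bit mono audio it received."""

    def transcribe(self, pcm: bytes, sample_rate_hz: int) -> str:
        seconds = len(pcm) / 2 / sample_rate_hz
        return f"[{seconds:.1f}s of audio]"


class FixedTranscriber:
    """Second stand-in engine, used to show that engines are interchangeable."""

    def __init__(self, text: str) -> None:
        self.text = text

    def transcribe(self, pcm: bytes, sample_rate_hz: int) -> str:
        return self.text


class VoicePipeline:
    """Depends only on the Transcriber interface, never on a concrete engine."""

    def __init__(self, transcriber: Transcriber, sample_rate_hz: int = 16_000) -> None:
        self.transcriber = transcriber
        self.sample_rate_hz = sample_rate_hz

    def handle_utterance(self, pcm: bytes) -> str:
        return self.transcriber.transcribe(pcm, self.sample_rate_hz)


if __name__ == "__main__":
    one_second = b"\x00\x00" * 16_000
    print(VoicePipeline(DurationTranscriber()).handle_utterance(one_second))
    print(VoicePipeline(FixedTranscriber("add task buy milk")).handle_utterance(one_second))
```

Because the pipeline only knows about the interface, replacing the speech engine is a one-line change at construction time.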
Corn
You're not locked into their speech to text quality. If a better model drops tomorrow, you plug it in.
Herman
That's the answer to why someone would build on Omi instead of grabbing a generic ESP thirty two dev board. A generic board gives you the chip and the pins. Omi gives you a tuned audio pipeline with hardware that's actually designed for wearable voice capture — the mic array placement, the acoustic housing, the power management for all day battery life. Those are the things that take months to get right on a bare board. The dev kit solves the physical engineering so you can focus on the software.
Corn
It's the difference between buying a bag of flour and buying a sourdough starter that someone's been feeding for two years.
Herman
And the screen processing side is where the architecture gets clever. The companion app — Android or iOS — takes a screenshot every two seconds. That screenshot goes through Tesseract OCR, but here's the detail that matters: they compiled Tesseract specifically for ARM, which means it runs efficiently on mobile silicon without needing to offload to a server. The extracted text then hits Phi three mini four k instruct running through llama dot cpp, also locally. The prompt is something like extract any tasks, deadlines, or follow ups from this text. Results go into a local SQLite database and surface as system notifications.
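The capture, OCR, extract, store loop described here can be sketched with the standard library alone. This is an illustration under stated assumptions, not Omi's code: the table schema, function names, and prompt wording are hypothetical, and the OCR and LLM steps are passed in as plain callables so that Tesseract and a llama.cpp model could be plugged in without changing the loop.

```python
import sqlite3
import time
from typing import Callable

# Paraphrase of the prompt quoted in the episode; the exact wording is an assumption.
PROMPT = "Extract any tasks, deadlines, or follow ups from this text:\n\n{text}"


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the local SQLite store (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, captured_at REAL, task TEXT)"
    )
    return conn


def process_capture(
    conn: sqlite3.Connection,
    image: bytes,
    ocr: Callable[[bytes], str],
    llm: Callable[[str], list[str]],
    now: float,
) -> int:
    """Run one screenshot through OCR and extraction; return tasks stored."""
    text = ocr(image)
    if not text.strip():
        return 0  # nothing readable on screen, skip the model call
    tasks = llm(PROMPT.format(text=text))
    conn.executemany(
        "INSERT INTO tasks (captured_at, task) VALUES (?, ?)",
        [(now, task) for task in tasks],
    )
    conn.commit()
    return len(tasks)


def run(
    conn: sqlite3.Connection,
    grab: Callable[[], bytes],
    ocr: Callable[[bytes], str],
    llm: Callable[[str], list[str]],
    interval_s: float = 2.0,  # from the episode: one screenshot every two seconds
    max_captures: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Capture on a fixed interval; return the total number of tasks stored."""
    total = 0
    for _ in range(max_captures):
        total += process_capture(conn, grab(), ocr, llm, time.time())
        sleep(interval_s)
    return total
```

A real version would also need deduplication, since an unchanged screen sampled every two seconds yields the same tasks on every pass.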
Corn
Two seconds between screenshots means you're sampling your screen thirty times a minute. That's enough to catch a Slack message or a calendar reminder, but not enough to read a full document in real time.
Herman
That sampling rate is actually a design choice, not a limitation. If you sampled faster, battery life tanks. If you sampled slower, you miss things. Two seconds is the sweet spot where you catch enough context without turning your phone into a space heater. The tradeoff, as we mentioned earlier, is accuracy. Cloud OCR with something like Google's Vision API hits around ninety nine percent on clean text. Tesseract on ARM, especially on complex UIs with mixed fonts and backgrounds, drops to about ninety two percent.
Corn
One in twelve characters is wrong. That sounds bad until you realize the LLM is there to clean it up. Phi three mini isn't just extracting tasks — it's also error correcting the OCR output based on context.
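The sampling and error figures in this exchange follow from simple arithmetic, shown below as a sketch. It assumes the accuracy numbers are per character, which is how they are used in the conversation; the 500 character screen is an arbitrary illustration.

```python
INTERVAL_S = 2  # from the episode: one screenshot every two seconds

captures_per_minute = 60 // INTERVAL_S
captures_per_hour = captures_per_minute * 60


def chars_per_error(accuracy: float) -> float:
    """Average number of characters read per misread character."""
    return 1 / (1 - accuracy)


def expected_errors(n_chars: int, accuracy: float) -> float:
    """Expected misread characters in a block of n_chars characters."""
    return n_chars * (1 - accuracy)


if __name__ == "__main__":
    print(captures_per_minute, "captures per minute")
    print(round(chars_per_error(0.92), 1), "characters per error on device")
    print(round(chars_per_error(0.99), 1), "characters per error in the cloud")
    print(round(expected_errors(500, 0.92)), "expected errors in 500 characters on device")
```

At ninety two percent the rate works out to one error per 12.5 characters, so "one in twelve" is a fair rounding.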
Herman
If Tesseract reads deadline Friday with a garbled character or two, the LLM sees the surrounding context and fixes it. That's the quiet genius of the pipeline: the OCR doesn't have to be perfect because the language model downstream is doing cleanup. It's a two stage filter.
Corn
You're trading raw OCR precision for privacy, but you're buying back some of that precision with the local LLM. The net accuracy gap is probably smaller than the headline numbers suggest.
Herman
That's what the Sarah Chen project demonstrates in practice. She built a system that watches her IDE terminal for build errors. Terminal text is monospaced, high contrast, dead simple for OCR — she's probably getting near cloud level accuracy on that specific use case. The system detects a build failure, extracts the error message, and auto creates a Jira ticket with the stack trace in the description. She published the whole thing on the Omi Hub in April.
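A build error to ticket step along these lines might look like the sketch below. It is not the published project's code: the regular expression, function names, and project key are hypothetical, and the payload is only shaped like the body that Jira's REST v2 create-issue endpoint accepts, so check it against the Jira documentation before sending anything.

```python
import re
from typing import Optional

# Assumption: a line containing one of these words marks a build failure.
ERROR_LINE = re.compile(r"\b(error|fatal|traceback)\b", re.IGNORECASE)


def find_build_error(terminal_text: str) -> Optional[str]:
    """Return the first line that looks like a build error, if any."""
    for line in terminal_text.splitlines():
        if ERROR_LINE.search(line):
            return line.strip()
    return None


def jira_issue_payload(project_key: str, terminal_text: str) -> Optional[dict]:
    """Build a create-issue body, or None when the output shows no error."""
    first_error = find_build_error(terminal_text)
    if first_error is None:
        return None
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": f"Build failure: {first_error[:120]}",
            "description": terminal_text,  # full OCR'd terminal output
            "issuetype": {"name": "Bug"},
        }
    }


if __name__ == "__main__":
    output = "Compiling main.c\nmain.c:12: error: 'x' undeclared\nmake: *** [all] Error 1"
    print(jira_issue_payload("OPS", output))
```

The word-boundary match means a summary line such as "0 errors" does not trigger a ticket, although a production filter would need to be tuned to the specific compiler or build tool.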
Corn
That's the kind of project that makes you wonder why IDEs don't have this built in. Your editor watches you fail and quietly files the paperwork.
Herman
The design agency project takes it even further. They've got Omi listening to client calls through the pendant microphone while simultaneously watching the designer's Figma screen through the screen processing beta. When a client says can we make that button blue and the designer hovers over the button, the system captures both the audio request and the visual context, then generates a revision note in Notion with a screenshot reference.
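Pairing what was said with what was on screen comes down to matching timestamps. The sketch below shows one way to do that; the data classes, the two second tolerance, and the note format are assumptions, not details of the agency's build.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Utterance:
    at: float   # seconds since the start of the session
    text: str


@dataclass(frozen=True)
class Capture:
    at: float
    ocr_text: str


def nearest_capture(
    utterance: Utterance, captures: list[Capture], max_gap_s: float = 2.0
) -> Optional[Capture]:
    """Find the capture closest in time; `captures` must be sorted by `at`."""
    times = [c.at for c in captures]
    i = bisect_left(times, utterance.at)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = captures[max(i - 1, 0): i + 1]
    best = min(candidates, key=lambda c: abs(c.at - utterance.at), default=None)
    if best is None or abs(best.at - utterance.at) > max_gap_s:
        return None
    return best


def revision_note(utterance: Utterance, captures: list[Capture]) -> Optional[str]:
    """Combine the spoken request with the matching screen context."""
    capture = nearest_capture(utterance, captures)
    if capture is None:
        return None
    return f"Client said: {utterance.text} | On screen: {capture.ocr_text}"


if __name__ == "__main__":
    shots = [Capture(0.0, "Home page"), Capture(2.0, "Nav bar"), Capture(4.0, "Submit button")]
    print(revision_note(Utterance(4.7, "can we make that button blue"), shots))
```

The tolerance matters: a request with no screenshot within two seconds is dropped here instead of being attached to unrelated screen content.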
Corn
That's not a productivity tool. That's a second employee who works for ninety nine dollars and never asks for a raise.
Herman
Both of these projects are possible because Omi exposes the full pipeline. You're not limited to their meeting summarizer or their default integrations. You write a plugin that hooks into the audio stream, the screen capture feed, or both, and you pipe the output wherever you want — Jira, Notion, Linear, Todoist, a custom webhook, whatever.
Corn
The open source Plaud analogy really lands here. Plaud gives you a finished house and says you can rearrange the furniture. Omi gives you the foundation, the framing, and the wiring diagram, and says build the house you actually want to live in.
Herman
The wiring diagram is the part that matters. The I two S bus, the ESP dash SR pipeline, the Tesseract ARM compilation, the llama dot cpp integration — these are the technical decisions that make the difference between a dev board that collects dust and one that actually ships projects. A generic ESP thirty two board doesn't come with an audio pipeline tuned for voice. It doesn't come with a companion app that handles screenshot capture and OCR scheduling. You'd spend weeks just getting to the starting line.
Corn
The value proposition isn't the hardware. It's the integrated stack. The ninety nine dollars buys you a known good configuration that someone else debugged.
Herman
The community is small but it's building on that stack in ways that suggest real momentum. The Omi Hub has twelve published projects, forty seven community forks, and the Discord has about twenty four hundred members. Those aren't staggering numbers, but for a dev kit that's been shipping for less than eighteen months, it's genuine traction.
Corn
The Humane AI Pin, by comparison, had a hundred times the funding, a massive launch event, and the ecosystem is now a paperweight. They shut down in February.
Herman
Twenty four dollars a month subscription, total cloud dependency, and when the servers went dark, the hardware died. Omi's approach is the opposite — zero recurring costs if you run everything locally, and if the company disappeared tomorrow, the hardware still works because nothing phones home.
Corn
That's the local sovereignty argument in hardware form. You own the device, you own the data, you own the pipeline. The tradeoff is you have to be willing to configure it.
Herman
The configuration isn't trivial, but it's also not as hard as people assume. The reference firmware comes pre flashed. You pair it with the companion app, set your wake word sensitivity, and point it at a local server running the Omi Server daemon. That server needs about eight gigs of RAM for the LLM component, which is why they recommend the Raspberry Pi five. From there, you connect to a vector database — ChromaDB or LanceDB are the supported options — for persistent memory, and then you wire up your integrations through webhooks.
Corn
Two hours from unboxing to a basic voice to task pipeline, assuming you've got the Pi ready to go.
Herman
That two hour timeline assumes you're comfortable with a terminal and have the Pi already imaged. If you're starting from a cold boot on the hardware side, add another hour for flashing the Pi's OS and getting the Docker containers running for the server daemon.
Corn
Three hours to a system that listens to you talk and watches your screen, then writes things down where you actually need them. That's less time than most people spend configuring their email filters.
Herman
The step by step path is approachable. Step one, you flash the reference firmware to the dev kit — it ships pre flashed for most orders, but if you want the latest build it's a single command over USB. Step two, you pair it with the companion app, which handles the Bluetooth handshake and lets you configure the wake word sensitivity and voice activity detection threshold. Step three, you set up a local server — the reference spec is a Raspberry Pi five with eight gigs of RAM, about eighty dollars — running the Omi Server daemon in Docker. Step four, you connect the daemon to a vector database. ChromaDB is the default, LanceDB is the alternative if you want something lighter. That database is what gives the system persistent memory — it stores embeddings of everything you've said and everything the screen processing has captured, so you can query it later. Step five, you wire up your task manager integrations through webhooks. Todoist, Linear, Notion — they all have REST APIs, and the server daemon has a plugin system that handles the authentication and formatting.
Corn
The vector database is the part that turns it from a fancy dictaphone into something that remembers context across sessions.
Herman
That's the secret ingredient. Without the vector store, every interaction is stateless. You say remind me about the Johnson account and the system has no idea what the Johnson account is. With the vector store, it can search across previous meeting transcripts, screen captures, and voice notes to surface relevant context. ChromaDB handles the embedding generation and similarity search locally — again, no cloud dependency.
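The retrieval idea can be shown without any external service. The sketch below is a deliberately simple stand-in, not ChromaDB: it swaps learned embeddings for bag-of-words counts and ranks by cosine similarity, and the `Memory` class and its methods are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class Memory:
    """In-memory store that returns the most similar past notes."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def query(self, text: str, k: int = 1) -> list[str]:
        q = embed(text)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [stored for stored, _ in ranked[:k]]


if __name__ == "__main__":
    memory = Memory()
    memory.add("Johnson account renewal is due Friday")
    memory.add("Dentist appointment moved to Tuesday")
    memory.add("Build failed on the payments service")
    print(memory.query("remind me about the Johnson account"))
```

Word counts only match on shared vocabulary, which is exactly the limitation a real embedding model removes.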
Corn
The webhook integrations are where the system graduates from passive observer to active participant. Once it can write to your task manager, the next logical step is having it execute actions directly.
Herman
That's where the community is heading, and it's the most interesting knock-on effect. Once you have a device that listens to your voice and watches your screen, and you trust it because everything stays local, the natural question becomes what should it do, not just what should it capture. The Omi Discord has channels dedicated to action plugins — people building integrations that trigger GitHub Actions from voice commands, send Slack messages when specific screen conditions are met, even control smart locks and lights through Home Assistant bridges.
Corn
The design agency we mentioned — they're not just capturing revision notes. The logical extension is Omi hears the client say approved, moves the Figma file to the handoff folder, pings the developer in Slack, and updates the project timeline in Notion. All from a voice command captured through a pendant and a screen state confirmed locally.
Herman
That's the build your own AI wearable category Omi is carving out. It sits between a consumer gadget — where you get what the product manager decided you need — and a developer toy where you're soldering headers and writing your own BLE stack. The dev kit gives you a finished hardware product with an open software pipeline. You're not writing drivers. You're writing integrations.
Corn
The Humane AI pin tried to be the consumer version of this and failed spectacularly. Twenty four dollars a month, total cloud dependency, and when the servers went dark in February, the hardware became e-waste. Omi's approach is the inverse — zero recurring cost, everything runs on hardware you own, and if the company vanishes, your pendant still works and your server still runs.
Herman
The tradeoff is the setup effort. Humane promised it just works. Omi promises it works if you're willing to configure it. For a certain kind of user, that's not a bug — it's the entire value proposition.
Corn
That certain kind of user is currently about twenty four hundred people in a Discord server and forty seven forks on GitHub. It's a candle, not a fire, but it's a candle that's actually burning, which is more than you can say for most hardware platform plays at this stage.
Herman
The Omi Hub numbers tell the story of an ecosystem at the very beginning. Twelve published projects, only three with more than a hundred downloads. Most builders are hobbyists, not enterprises. Compare that to the Plaud ecosystem, which is a closed garden — you get what Plaud ships, and the community doesn't build on top of it because there's nothing to build on. Or compare it to Humane, which had a developer program that never gained traction because the platform was tethered to a subscription model that nobody wanted to pay for.
Corn
The closed ecosystems die when the company dies. The open ecosystems die when the community loses interest. Omi's bet is that the community won't lose interest because the thing is actually useful in a way that closed alternatives aren't.
Herman
The privacy angle is the structural advantage that keeps the community engaged. When everything runs locally, you're not just avoiding subscription fees — you're avoiding the entire category of risk that comes with sending your screen contents and voice recordings to someone else's server. For the design agency handling confidential client work, or the developer working on proprietary code, or the lawyer who wants meeting notes without a third party retention policy, local processing isn't a nice to have. It's the only viable option.
Corn
The Raspberry Pi five with eight gigs is the entry ticket. Eighty dollars for a server that sits on your desk and handles all the LLM inference, the vector search, the OCR cleanup. That's less than four months of a Humane subscription.
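The cost comparison is easy to check. The prices below are the ones quoted in the episode; the break-even helper is only an illustration.

```python
import math

DEV_KIT_USD = 99          # from the episode
PI5_8GB_USD = 80          # from the episode
HUMANE_MONTHLY_USD = 24   # from the episode

hardware_total = DEV_KIT_USD + PI5_8GB_USD


def subscription_cost(months: int) -> int:
    """Total paid for a subscription after the given number of months."""
    return HUMANE_MONTHLY_USD * months


def breakeven_months(one_time_usd: int, monthly_usd: int) -> int:
    """Months of subscription needed to meet or exceed a one-time cost."""
    return math.ceil(one_time_usd / monthly_usd)


if __name__ == "__main__":
    print("Hardware total:", hardware_total)
    print("Four months of subscription:", subscription_cost(4))
    print("Months to cover the Pi:", breakeven_months(PI5_8GB_USD, HUMANE_MONTHLY_USD))
    print("Months to cover all hardware:", breakeven_months(hardware_total, HUMANE_MONTHLY_USD))
```

Four months of the subscription comes to ninety six dollars, so the eighty dollar Pi does cost less, and the full one hundred seventy nine dollar setup is covered by the eighth month.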
Herman
You can scale it. If you outgrow the Pi, you move the daemon to a NUC or an old laptop or a home server rack. The server daemon is just a Docker container — it doesn't care what hardware it's running on as long as there's enough RAM for the model.
Corn
The path from zero to a working voice productivity system is a weekend project. The path from working system to something bespoke — with custom plugins and action execution and context aware memory — that's where the ongoing investment lives.
Herman
That's the question that'll determine whether Omi reaches critical mass. The dev kit lowers the barrier to entry dramatically, but the ceiling on what you can build is high enough that the people who get invested tend to stay invested. Whether that community grows from a few thousand enthusiasts to a self sustaining ecosystem depends on whether the early projects are useful enough that other people want to replicate them without being the kind of person who enjoys configuring Docker containers.
Corn
Sarah Chen's build error to Jira system and the design agency's Figma to Notion pipeline are the proof of concept. The question is whether the next hundred projects make the leap from clever hack to something a non developer would actually install.
Herman
That brings us to something actionable.
Corn
If someone's listening and thinking I want this but I don't want to spend six months learning embedded systems, where do they actually start?
Herman
The dev kit page on Omi's site. Ninety nine dollars, ships in about a week. The reference firmware comes pre flashed, so you're not wrangling toolchains. Unbox it, charge it, pair it with the companion app, and you've got a working voice recorder in under ten minutes. The voice to task pipeline takes a bit more: figure two hours if you have a Raspberry Pi five ready to go with the Omi Server daemon in Docker.
Corn
The screen processing piece?
Herman
That's the beta. You opt in through the companion app — it installs the Tesseract OCR binary and the Phi-three-mini model on your local server. From there, you configure which app windows to monitor. The smart move is to start narrow — point it at one application, like your email client or your task manager, rather than trying to observe your entire desktop. OCR accuracy on a single clean UI is closer to ninety five percent. Throw fifteen windows with overlapping panels at it and you're back down to the low nineties, plus the LLM has to work harder to extract meaningful tasks from the noise.
Corn
The advice is: don't try to build the omniscient screen observer on day one. Start with voice notes to Todoist, get that pipeline solid, then add screen monitoring for one specific workflow.
Herman
Voice notes to Todoist is the hello world of this ecosystem. You set up a webhook from the Omi Server daemon to the Todoist REST API, configure a voice trigger phrase like add task, and suddenly anything you say after that trigger lands in your inbox with a timestamp and a transcript. Two hours, maybe three if you're reading the docs carefully.
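The trigger phrase step can be sketched as follows. This is not the Omi Server plugin API: the function names are hypothetical, the endpoint URL and bearer-token header reflect Todoist's REST v2 interface as an assumption to verify against the current Todoist documentation, and the request is only built here, never sent.

```python
import json
import re
import urllib.request
from typing import Optional

# Assumption: everything spoken after "add task" is the task text.
TRIGGER = re.compile(r"\badd task\b[\s,:]*(.+)", re.IGNORECASE | re.DOTALL)

# Assumption: check this URL against the current Todoist API documentation.
TODOIST_TASKS_URL = "https://api.todoist.com/rest/v2/tasks"


def extract_task(transcript: str) -> Optional[str]:
    """Return the text following the trigger phrase, or None if absent."""
    match = TRIGGER.search(transcript)
    if match is None:
        return None
    task = match.group(1).strip(" .")
    return task or None


def build_request(task: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP request that would create the task."""
    body = json.dumps({"content": task}).encode("utf-8")
    return urllib.request.Request(
        TODOIST_TASKS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    spoken = "Hey, add task call the dentist tomorrow."
    task = extract_task(spoken)
    if task is not None:
        request = build_request(task, token="YOUR_TOKEN")  # placeholder token
        print(request.get_method(), request.full_url, request.data)
```

Sending would be a single `urllib.request.urlopen(request)` call; keeping it separate makes the parsing testable without network access or credentials.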
Corn
The Discord has twenty four hundred people who've already done this and can answer questions. Half the value of the ecosystem is the community, not the hardware.
Herman
Join the Discord, clone the reference server from the BasedHardware GitHub repo — that's the organization name, BasedHardware — and start with the simplest integration you'll actually use. Voice notes to Todoist. Meeting transcripts to Notion. Build error detection to Slack. Pick one, get it working, live with it for a week, then add the next piece.
Corn
The bigger lesson here is that the second brain concept has been trapped in the consumer product fantasy for years — some perfectly polished device that Just Works and organizes your entire life. Omi proves you don't need that. A ninety nine dollar dev kit, an eighty dollar Raspberry Pi, and a weekend of tinkering gets you eighty percent of the way to a system that actually reduces cognitive load instead of adding to it.
Herman
That eighty percent is the part that matters. Capturing thoughts before they evaporate, surfacing tasks from conversations you'd otherwise forget, watching your screen for the thing you said you'd follow up on. The last twenty percent — the polished UI, the seamless onboarding, the consumer grade fit and finish — that's what companies charge subscriptions for. If you're willing to trade some polish for total ownership, the path exists right now.
Corn
Total cost of entry: under two hundred dollars and a Saturday afternoon. Total recurring cost: the electricity to run a Raspberry Pi. That's the open source second brain, and it's already shipping.
Corn
The open question is whether Omi stays a developer playground or eventually ships something your aunt could set up. Right now, the entire ecosystem depends on people who know what a Docker container is and aren't afraid of a YAML file. That's a ceiling.
Herman
It's a real ceiling. The Discord has twenty four hundred members, the GitHub has forty seven forks — those are hobbyist numbers, not platform numbers. If Omi wants to cross into something self sustaining, they either need to make the setup dramatically easier or grow the community by an order of magnitude. The thing is, I'm not sure they want to cross that line. Some companies are perfectly happy being the framework everyone builds on, not the finished product everyone buys.
Corn
The tension is whether the community can sustain itself if Omi the company runs out of runway. Open source hardware projects have a graveyard problem — when the company stops shipping boards, the ecosystem fragments across whoever still has working units. The difference here is that the server daemon and the models don't depend on the pendant. You could swap in any BLE microphone and the pipeline still works.
Herman
That's the structural resilience. If Omi disappeared tomorrow, Sarah Chen's build error detector still runs. The design agency's Figma integration still fires. The value is in the software stack, not the pendant itself. The pendant is just the most convenient input device.
Corn
The bigger inflection point is screen processing. If that beta matures to the point where OCR accuracy on complex UIs crosses, say, ninety five percent, and the LLM extraction gets reliable enough that you trust it with real tasks, then manual task entry starts looking like a legacy behavior. You wouldn't type follow up with the Anderson account — the system would just know you need to because it watched you read that email and heard you mutter I'll deal with this later.
Herman
That's where the privacy versus convenience tradeoff gets sharp. The reason Omi can do this without a privacy backlash is that everything stays local. But the moment screen processing becomes good, the convenience pull is enormous. Most people will choose convenience. The question is whether they choose Omi's local convenience, or whether Apple or Google ship the same feature with cloud processing and a polished onboarding flow.
Corn
If I had to bet, the polished cloud version wins the mass market, and Omi's local version wins the niche that actually cares about ownership. That niche is small but it's sticky, and it's the same niche that runs home servers and self hosts email and compiles their own kernels.
Herman
For that niche, this is exciting. If you've ever wanted to build your own Jarvis, the Omi dev kit is your starting point. Links to the kit, the GitHub repo, and the Discord are in the show notes.
Corn
Under two hundred dollars and a weekend. That's the pitch.
Herman
Thanks to our producer Hilbert Flumingtop for making this episode sound like something.
Corn
This has been My Weird Prompts. Find us at myweirdprompts dot com and on Spotify.
Herman
Go build something weird.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.