#3792: Cloud Brain, Local Fingers: Decoupled Home Assistant

Can Home Assistant run in the cloud while Zigbee stays local? We explore the decoupled control plane architecture.

Episode Details
Episode ID: MWP-3971
Duration: 29:42
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: deepseek-v4-pro

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

This episode tackles a provocative architecture question: can Home Assistant run on a cloud VPS while keeping Zigbee and Matter radios local? The answer is yes — with important caveats. The proposal uses a "decoupled control plane" model: the orchestration logic lives in the cloud (a VPS running Home Assistant OS), while the data plane — the actual radio communication with devices — stays local on a small coordinator device. Tailscale tunnels connect the two over encrypted WireGuard links.

The key insight is that this trades one set of failure modes for another. Local setups suffer from SD card corruption, power supply failures, and the dreaded midnight re-flash. Cloud setups add ISP dependency and VPS outage risk. But with Zigbee bindings — a protocol-level feature that lets switches talk directly to bulbs without a controller — critical functions like lighting can survive a cloud outage entirely. The latency cost is 30-80ms for most setups, imperceptible for light switches but potentially problematic for presence sensors.

Matter adds complexity because it assumes controller and device are on the same LAN. The Thread Border Router must be local, and the Matter controller needs mDNS discovery, which doesn't work across Tailscale subnets without careful configuration. For now, the architecture works best with pure Zigbee and Z-Wave devices, with Matter kept on a separate local path.



Corn
Daniel sent us this one — and I have to admit, I read it twice. He's proposing an architecture where Home Assistant runs on a cloud VPS instead of a little box under your TV, with a local coordinator handling Zigbee and Matter that tunnels back to the cloud brain over Tailscale. The question basically is — is this viable, does it have a name, and how would you actually make it work, especially with Matter devices in the mix? He also mentions the renter angle — they're moving apartments, the ISP does that charming thing where your internet dies in one place the moment it lights up in the other, and he's realizing that if the brain lived in the cloud, you just plug in your coordinator hardware at the new place and... No server relocation, no re-flashing SD cards, no frantic weekend of debugging.
Herman
Okay so the first thing I want to say is — I get why someone hears this and immediately thinks you've abandoned the local-first philosophy that Home Assistant was built on. But that's not what's being proposed here at all. The actuation is still local. The radio is still local. The command still travels sixty centimeters from a Zigbee coordinator to a light bulb. What's moved is the orchestration layer. The brain that decides what happens when. And I think calling this a "decoupled control plane" architecture actually captures what's going on — the control plane is in the cloud, the data plane is local.
Corn
Decoupled control plane. I like that. It sounds like the sort of thing someone at a networking conference would say right before you quietly leave the room, but it actually names the thing. It's the same mental model as those Kubernetes clusters where the API server lives in one place and the kubelets are scattered across three availability zones. Except our availability zones are "the apartment," "the new apartment," and "my mother's house."
Herman
That's exactly the right analogy. And the reason this architecture exists at all — the reason someone would even think to build this — is that the conventional fully-local setup has its own failure modes that nobody really talks about. A Raspberry Pi with an SD card running Home Assistant is a single point of failure that corrupts itself over time. SD cards wear out. Power supplies get flaky. I have seen more dead Home Assistant instances than I can count, and in every single case, the Zigbee network was still up, the devices were still paired, but nothing could be controlled because the brain was dead.
Corn
If the brain is dead, you're standing in your kitchen at eleven at night waving at a motion sensor like a confused wizard while nothing happens. I've been that wizard. I've cast "lumos" at a Philips Hue bulb with increasing desperation.
Herman
And a cloud brain doesn't have the SD card problem. It runs on redundant storage in a data center. If the VPS instance itself goes down, you can spin up a new one from a backup in minutes — not hours of re-flashing and re-pairing. Cloud providers offer snapshotting, automated backups, failover. I've personally restored a Home Assistant instance on Hetzner from a snapshot in under four minutes. Try doing that with a corrupted SD card at midnight when you don't have a spare.
Corn
You've just relocated the single point of failure. Now the thing that can break is the VPS provider having an outage, or your internet connection dropping, or the tunneling going sideways. You've traded one fragility for another. It's like moving your life savings from under the mattress to a bank, and then realizing the bank can have a power outage.
Herman
And that's the honest answer to whether this is more resilient or just differently fragile — it depends on whether your local hardware or your VPS provider has better uptime. Hetzner, DigitalOcean, they run at something like 99.99 percent uptime. Most people's home Raspberry Pi setup is nowhere near that. But if your home internet goes down, it doesn't matter that the cloud brain is still alive — the coordinator can't reach it. So the question becomes: what's statistically more likely? Your ISP dropping for an hour, or your SD card corrupting irreversibly? And the answer varies by person. Someone on symmetric fiber in Zurich has a very different risk profile from someone on DSL in a rural area.
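The trade-off Herman describes can be put as rough arithmetic. This is a minimal sketch, assuming the cloud path only works when the VPS, the home internet link, and the local coordinator are all up at once and that they fail independently. The 99.99 percent figure comes from the conversation; the ISP, coordinator, and Raspberry Pi numbers are placeholders to replace with your own.

```python
HOURS_PER_YEAR = 24 * 365


def series_availability(*parts: float) -> float:
    """Availability of a chain that needs every part up (independent failures)."""
    total = 1.0
    for part in parts:
        total *= part
    return total


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


vps = 0.9999         # figure quoted in the episode
home_isp = 0.998     # ASSUMPTION: placeholder for your ISP
coordinator = 0.999  # ASSUMPTION: placeholder for the local radio box
local_pi = 0.995     # ASSUMPTION: placeholder for an all-local Pi with SD card

cloud_path = series_availability(vps, home_isp, coordinator)

print(f"cloud path downtime: {downtime_hours_per_year(cloud_path):.1f} h/year")
print(f"local Pi downtime:   {downtime_hours_per_year(local_pi):.1f} h/year")
```

With different placeholder values the comparison can flip, which is the point Herman makes about risk profiles varying by person.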
Corn
Unless you design fallback automations.
Herman
That's the key. If your local coordinator has some intelligence of its own — and this is where this architecture gets interesting — you can program fallback logic. Zigbee bindings, for example, can pair a switch directly to a bulb at the radio level so even if the coordinator loses connection to the cloud brain, the switch still turns on the light. That's not a Home Assistant thing — that's a Zigbee protocol feature. The coordinator tells devices about each other, and they keep talking even if the controller drops out. It's like teaching your devices to have a short conversation without the central operator.
Corn
You can have basic lighting control survive a cloud outage. That's about eighty percent of the complaints right there. Nobody's writing angry forum posts because their temperature logger stopped logging for twelve minutes. They're writing them because the hallway is dark.
Herman
Maybe eighty-five percent. Nobody wants to explain to their spouse why the bathroom lights don't work because a server in Frankfurt went down. "The server in Frankfurt" is not an acceptable answer in a domestic dispute about bathroom illumination at six in the morning.
Corn
I'd argue it's not an acceptable answer in any domestic dispute. "Why didn't you take out the trash?" "Server in Frankfurt."
Herman
The binding approach is genuinely elegant because it operates at the radio layer. The coordinator essentially says to the switch and the bulb, "You two, talk directly. Here are your addresses. Here's the encryption key. If I disappear, you still know what to do." It's a pre-negotiated handshake that survives the negotiator leaving the room.
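As an illustration of the binding idea, here is a minimal sketch that assumes the coordinator runs Zigbee2MQTT (one of the options mentioned in the episode) with its default `zigbee2mqtt` base topic. It only builds the MQTT topic and JSON body for a bind request; the device names `hallway_switch` and `hallway_bulb` are hypothetical, and publishing the message with an MQTT client is left out.

```python
import json

BASE_TOPIC = "zigbee2mqtt"  # Zigbee2MQTT's default base topic


def bind_request(source: str, target: str, clusters=None):
    """Return (topic, payload) asking Zigbee2MQTT to bind source to target."""
    body = {"from": source, "to": target}
    if clusters:
        body["clusters"] = list(clusters)
    return f"{BASE_TOPIC}/bridge/request/device/bind", json.dumps(body)


# Hypothetical friendly names; use the names from your own installation.
topic, payload = bind_request("hallway_switch", "hallway_bulb", ["genOnOff"])
print(topic)
print(payload)
```

Once the devices accept the binding, the switch addresses the bulb directly over the radio, so the on/off path no longer depends on the tunnel or the cloud instance.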
Corn
The architecture exists to address a specific failure pattern that local setups don't handle well — hardware death — and with fallback logic it can also survive cloud outages for critical functions. Let's get into the actual mechanics. How does the traffic flow when you tap a button in the Home Assistant dashboard?
Herman
The flow is surprisingly straightforward. You've got Home Assistant OS running on a VPS somewhere. That Home Assistant instance is connected to Tailscale, which gives it a virtual IP address on your Tailscale tailnet. At home, you have a small device — let's say a Raspberry Pi 4 or even a Sonoff Zigbee dongle connected to something lightweight — and it's also running Tailscale. But the local device also runs Zigbee2MQTT or it could be a Hubitat hub that speaks its own API. The local device is essentially a protocol translator. It speaks Tailscale-encapsulated IP on one side and Zigbee radio on the other.
Corn
The coordinator hardware is the bridge between two worlds. One foot in the cloud tailnet, one foot in the local Zigbee mesh. It's a diplomatic attaché between two sovereign territories.
Herman
And what Tailscale's subnet routing does is expose the entire local network — or specific parts of it — to other devices on the tailnet. So your cloud Home Assistant instance can see the local coordinator as if it's on the same Ethernet segment. The cloud instance thinks the coordinator is at 192.something, and that traffic goes over WireGuard, which is the VPN protocol Tailscale uses. WireGuard is remarkably efficient — it's in the Linux kernel now, it adds minimal overhead per packet.
Corn
When I tap "Living Room Lights On" in my browser...
Herman
The command hits the Home Assistant API on the cloud instance. Home Assistant processes it, sees that the device is a Zigbee light connected through this specific coordinator integration, constructs the Zigbee command, encapsulates it, and sends it through the Tailscale tunnel to the coordinator. The coordinator receives it, looks at its local routing table, and transmits the Zigbee command over the air to the light. And the light turns on. The entire chain is: browser to cloud API, cloud to tunnel, tunnel to coordinator, coordinator to bulb. Four hops, but the middle two are essentially transparent from the application's perspective.
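To make the middle of that chain concrete, here is a small sketch under the assumption that the local box runs Zigbee2MQTT. In that variant, what crosses the tunnel is an MQTT message, and Zigbee2MQTT on the coordinator turns it into the over-the-air Zigbee command. The friendly name `living_room_lights` is hypothetical.

```python
import json


def light_set_message(friendly_name: str, on: bool, base_topic: str = "zigbee2mqtt"):
    """Return the (topic, payload) pair that switches a Zigbee2MQTT light."""
    topic = f"{base_topic}/{friendly_name}/set"
    payload = json.dumps({"state": "ON" if on else "OFF"})
    return topic, payload


# The four hops from the conversation, with the message that rides hops 2 and 3.
hops = ["browser -> cloud API", "cloud -> tunnel", "tunnel -> coordinator", "coordinator -> bulb"]
topic, payload = light_set_message("living_room_lights", True)
for number, hop in enumerate(hops, start=1):
    print(number, hop)
print(topic, payload)
```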
Corn
The whole time there's an encrypted tunnel wrapping everything. What does that cost in terms of latency? Because four hops sounds like a lot of handshakes.
Herman
This is where the numbers matter. A typical round trip in a purely local setup — say, a Raspbee on the same network wired into a local Home Assistant instance — that's about five to maybe twelve milliseconds from dashboard tap to actuation. Same network, microcontrollers handling it, no internet round trip. With the cloud setup, depending on where your VPS is relative to your home, you're looking at an added thirty to fifty milliseconds of latency typically. Sometimes seventy or eighty. Some routes can push a hundred. The variable is physical distance — light in fiber only travels so fast — and the number of network hops between your ISP and the data center.
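The distance argument can be checked with back-of-the-envelope numbers. This sketch assumes light in fiber covers roughly 200 km per millisecond (about two thirds of its vacuum speed) and computes the physical floor for a round trip. Real routes are longer than the straight-line distance and add switching delay, so measured values will sit above this floor. The distances are illustrative.

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in fiber


def fiber_rtt_floor_ms(distance_km: float) -> float:
    """Lower bound on round-trip time imposed by fiber propagation alone."""
    return 2.0 * distance_km / FIBER_KM_PER_MS


def total_latency_ms(local_ms: float, added_ms: float) -> float:
    """Dashboard-tap-to-actuation time: local baseline plus the tunnel cost."""
    return local_ms + added_ms


for km in (100, 500, 2000):  # illustrative home-to-data-center distances
    print(f"{km} km: at least {fiber_rtt_floor_ms(km):.1f} ms round trip")

# Figures from the conversation: 5 to 12 ms locally, 30 to 80 ms added by the cloud hop.
print(total_latency_ms(5, 30), total_latency_ms(12, 80))
```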
Corn
Eighty milliseconds means the light comes on just under a tenth of a second after you tap. Your eye doesn't register that. I can't perceive a tenth of a second. I can barely perceive that I left my keys on the counter.
Herman
For lights and switches, it's totally negligible. Human reaction time is about two hundred milliseconds. So a light that turns on in thirty versus eighty milliseconds: impossible to perceive the difference. Even if you count your own reaction time, two hundred eighty milliseconds end to end is only slightly slower than the roughly two hundred you'd get with a local setup, so we're in "barely perceptible" territory. At worst it feels like a slightly soft switch. Like the difference between a mechanical keyboard switch and a membrane one: you might notice if you're paying extremely close attention, but in day-to-day use it fades into the background.
Corn
The inflection point is where the latency actually matters.
Herman
This is exactly where it gets problematic. Take an Aqara FP2 millimeter-wave presence sensor. The way it's more or less designed to work, it detects human presence without motion required, so it literally senses a stationary body in the room. When it fires an automation event to turn on lights, the expected latency is tiny because the coordinator and the brain are adjacent. If you add sixty milliseconds for the tunneled cloud trip, plus whatever jitter a bad route adds, the light can come on noticeably after you've walked into the room. But more dangerously...
Corn
For a fire alarm, even two seconds of extra latency because a VPS is throttled for some random billing issue is categorically unacceptable. Two seconds is the difference between "there's smoke" and "there's a lot of smoke." And billing issues happen. I've had a cloud provider throttle my instance because a credit card expired and I didn't notice the email. Suddenly my CPU quota was cut to a tenth. I found out when a database query took forty seconds instead of four.
Herman
Life safety devices need to be on a separate, local path anyway. You wouldn't run an alarm siren purely through a cloud-brain regardless of latency. That should be locally handled. This is one of those design principles that sounds obvious when you say it out loud, but it's easy to forget when you're drawing architecture diagrams at two in the morning. Life safety stays local. No tunnel, no cloud, no dependency on a credit card being valid.
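One way to keep that principle from slipping at two in the morning is to write it down as an explicit rule. This is a purely illustrative sketch: the `Device` class, the path labels, and the example devices are hypothetical and not part of Home Assistant or any other tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    life_safety: bool = False        # smoke alarms, sirens, and similar
    critical_lighting: bool = False  # lights that must work during an outage


def control_path(device: Device) -> str:
    """Pick where a device's control logic should live."""
    if device.life_safety:
        return "local-only"                 # no tunnel, no cloud dependency
    if device.critical_lighting:
        return "zigbee-binding-plus-cloud"  # radio-level fallback, cloud automations on top
    return "cloud"


for device in (
    Device("smoke_alarm", life_safety=True),
    Device("hallway_light", critical_lighting=True),
    Device("temperature_logger"),
):
    print(device.name, "->", control_path(device))
```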
Corn
The other dimension of this worth exploring is the Matter angle, because Matter is the promised land where everything just works, and adding a tunnel under it might put us back in cursed territory. Matter was supposed to solve the interoperability nightmare, not create a new one where your devices can't see your controller because it's in a data center three hundred miles away.
Herman
Matter complicates things, but not in an unsolvable way. The issue is that Matter commissioning relies on local discovery protocols — mDNS, also known as Bonjour or zeroconf, broadcasts across the local network segment to find nearby Matter controllers. The phone or controller that is commissioning a new Matter device needs to be on the same logical LAN. mDNS is fundamentally a "shout into the local subnet and see who answers" protocol. It's not designed to cross network boundaries.
Corn
Your VPS in a data center is not on the local LAN, regardless of what Tailscale tells it. Tailscale can make IP addresses reachable, but it doesn't magically forward multicast DNS packets across subnets.
Herman
mDNS packets don't traverse subnets by default. They use a reserved link-local multicast address, 224.0.0.251, and routers drop multicast traffic unless explicitly configured to forward it. So when you bring home a Matter bulb and go to commission it, your cloud Home Assistant can't see the bulb shouting "I'm here." The fix is what's called mDNS reflection or an mDNS relay across the tunnel. Tailscale supports SSH and various protocols nicely, but mDNS forwarding specifically requires either running an mDNS proxy on your local router (OpenWRT and pfSense can do this) or using a more sophisticated trick.
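To show what that "shout" looks like on the wire, here is a minimal sketch that builds an mDNS PTR query by hand and checks why it stays on the local segment. It assumes the standard mDNS group 224.0.0.251 on UDP port 5353 and uses `_matterc._udp.local`, the service name Matter uses for commissionable devices. Nothing is sent; the packet is only constructed.

```python
import ipaddress
import struct

MDNS_GROUP = "224.0.0.251"  # IPv4 mDNS multicast group
MDNS_PORT = 5353


def encode_name(name: str) -> bytes:
    """Encode a DNS name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_ptr_query(service: str) -> bytes:
    """Build a one-question DNS query (type PTR, class IN) for a service name."""
    # Header fields: id, flags, qdcount, ancount, nscount, arcount
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(service) + struct.pack("!2H", 12, 1)
    return header + question


packet = build_ptr_query("_matterc._udp.local")
group = ipaddress.ip_address(MDNS_GROUP)

print("multicast:", group.is_multicast)
# 224.0.0.0/24 is the local network control block, which routers do not forward.
print("link-local block:", group in ipaddress.ip_network("224.0.0.0/24"))
print("query bytes:", len(packet))
```

A reflector or relay on the local side has to receive packets like this one and repeat them where the controller can hear them, because a plain routed tunnel carries only unicast traffic.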
Corn
Such as creating a pure Layer 2 extension across the tunnel. Which is the networking equivalent of performing surgery with a butter knife — it works, but nobody enjoys it.
Herman
That's possible, but nobody loves doing it. My recommendation for the pragmatic implementation is actually a different pattern. Keep a Thread Border Router on the local network. It stays there in the apartment. That could be an Apple TV or a HomePod mini. You don't want to touch it. You can then run a Matter controller on the VPS, have it see the Border Router across the Tailscale subnet route, and the Border Router bridges commissioning locally. New Matter devices join the local Thread mesh, the Border Router facilitates communications with the cloud-held controller via the tunnel.
Corn
The Border Router is your Matter-native local leg. It's the embassy. It handles all the local protocol stuff — the mDNS, the Thread commissioning, the device attestation — and then speaks a clean IP-based protocol back to your cloud controller.
Herman
And the Matter spec has been working toward more seamless multi-admin controller scenarios. Version one point four last year brought advances: not yet perfect, but workable. The vision is that a single Matter device can be commissioned into multiple fabrics simultaneously, so your cloud controller and your local fallback controller can both have authority.
Corn
That sounds like a workplace compliance nightmare, the phrase alone. "Please submit a ticket to add this light bulb to the secondary administrative domain."
Herman
I've followed Matter's progression, and the direction is not fully harmonized. I'm not certain it's as smooth as I'm implying for an out-of-the-box configuration just yet, and some of the expected full-function enhancements are reportedly on track for Matter two-point-zero over the coming years. But in any case, yes, it is manageable with a decent router. The trajectory is positive even if the present is a bit rough around the edges.
Corn
Let's put the Matter awkwardness to one side for now. What I'd like to explore is the single strongest argument the prompt raises — the use-case that this architecture is literally designed for.
Herman
Anyone who moves periodically. People on short-term leases. Digital nomads who maintain a home base but aren't always there.
Corn
Because it's a story I've lived. I once named a computer Sputnik two, and moved between three apartments. Setting up networking every time felt like re-inventing Newtonian physics from scratch. If Home Assistant lived on a VPS, that wouldn't have happened. I would have plugged in a box, watched it phone home, and gone back to unpacking dishes instead of debugging DHCP reservations for four hours.
Herman
Here's the magic of this architecture for a renter. The coordinator hardware is tiny — a USB dongle plugged into a Raspberry Pi, or a standalone box like a Hubitat. The hardware looks like a mundane device. When you move, you unplug it and put it in a box with your modem. At the new apartment, plug it in, connect it to the local network, power it up. It phones home over Tailscale. You update a DNS record or drop the Tailscale IP into whatever config param — exactly as suggested — and suddenly all your Zigbee devices still remember their pairings from the coordinator. Your lighting automations from last week fire again. The devices don't know they moved. The coordinator doesn't care what IP address it has. The cloud brain doesn't care where the coordinator is, as long as it can reach the Tailscale address.
Corn
No weeks of dread because of a failed SD card after moving day. I've had SD cards die from being looked at wrong. Putting one in a moving box is basically a burial at sea.
Herman
No dead microSD from jostling in a box. And I can't count the number of times a friend has asked about smart home stuff and the conversation shifts to renting and immediately it's a non-option. You get pushback about holes or spackling for mounting sensors, and my recommendation always has to include permission or cord-friendly design. The rental market already constrains what people feel they can do with their living space. Adding "and also you need to run a server 24/7 that you'll have to rebuild when you move" just kills the conversation.
Corn
The cloud-brain renter system essentially functions not as a house-bound system but a per-resident architecture. Your smart home follows you. It's not attached to the building — it's attached to your Tailscale account, your coordinator dongle, and your VPS instance.
Herman
A single cloud Home Assistant can theoretically manage multiple locations. With a coordinator at your apartment and another at your mum's house, it becomes a neat multi-site backend. You have one web dashboard. You can see that the lights are off at your place and that your mum's thermostat is set correctly, all from the same interface. That's useful for people managing multiple spaces: aging parents, vacation homes, a workshop separate from the house.
Corn
Do we have any hard dollar figures, given how long we've been talking in the abstract? How do you price the compute side? Because one of the unspoken promises here is that this might actually be cheaper than buying a dedicated NUC.
Herman
The first scenario starts at five or six bucks a month. Consider lighting with a very simple Zigbee stick and some floor lamps. Costs: a VPS equivalent to a Hetzner CX twenty-two at $6 a month, with two vCPUs and 4GB of RAM. More than enough for HA, fresh out of the box. Local coordination via a Raspberry Pi Zero 2 W, $35 or so, with the dongle on USB, approximately $15. That's about $50 of hardware, so even a budget approaching $100 is lean.
Corn
In comparison to what? Some folks buy a two-hundred dollar NUC because of how many more options they think they want nearby. If computing is on the VPS you practically don't need acceleration locally. The local box is a radio dongle with delusions of grandeur.
Herman
It's a nothing chip on site. You could hide it nearly anywhere. So you're looking at roughly fifty dollars of equipment upfront plus about six a month. The upfront cost gets tiny. And consider the option of trying Oracle Cloud's free ARM tier, with up to four OCPUs, potentially delivering the compute at no charge. I'd say that qualifies as a legitimate trial setup that you can later transition from.
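The figures quoted so far can be added up directly. This is a rough sketch using only the prices mentioned in the conversation ($35 board, $15 dongle, $6 a month, a $200 NUC). It ignores electricity, and it ignores that the NUC would also need its own Zigbee dongle, so treat the break-even point as indicative only.

```python
def cumulative_cost(upfront: float, monthly: float, months: int) -> float:
    """Total spend after a number of months."""
    return upfront + monthly * months


PI_ZERO = 35      # Raspberry Pi Zero 2 W, as quoted
DONGLE = 15       # Zigbee USB dongle, as quoted
VPS_MONTHLY = 6   # Hetzner-class VPS, as quoted
NUC = 200         # dedicated local machine, as quoted

cloud_upfront = PI_ZERO + DONGLE
first_year = cumulative_cost(cloud_upfront, VPS_MONTHLY, 12)
break_even_months = (NUC - cloud_upfront) / VPS_MONTHLY

print(f"cloud setup, upfront: ${cloud_upfront}")
print(f"cloud setup, first year: ${first_year}")
print(f"months until the cloud setup has cost as much as the NUC: {break_even_months}")
```

On these numbers the cloud variant is cheaper for about the first two years, and a free tier would remove the monthly term entirely.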
Corn
It sounds strange to call it free, and it nearly raises an eyebrow, but they do offer a free tier and have maintained it continuously for years. The "always free" tier on Oracle Cloud includes ARM instances that are capable. There's a whole community of people running Home Assistant on them.
Herman
Since 2022, since they opened up the ARM infrastructure, it's been running nicely and reliably for exactly this. I think it stays within a testing budget without question. And the ARM architecture is actually well-suited to Home Assistant. It's not x86, but the HA project provides ARM64 images, and most add-ons have ARM builds at this point.
Corn
Something close to zero cloud spend and negligible hardware outlay gets you a fully integrated ecosystem. That sure beats three rounds of cable modem diagnostics when the ISP completely zeroes out your connectivity midday. I've spent entire afternoons on the phone with an ISP because they decided my modem's MAC address was no longer welcome on their network.
Herman
That's the problem we started with. The one thing I'd flag now is the security reality, because I don't want to pretend this freedom removes all the risk or attack surface that comes with connectivity. A VPS puts the same management endpoints on a machine that's reachable from anywhere online, whereas a local-only setup limited you to attackers who were already on or near your network. With a good zero-trust overlay, and a strict rule that access only happens through your Tailscale account with no ports opened otherwise, you avoid the nasty port-scanning scenarios people imagine. But anything exposed outside the tunnel, such as the Home Assistant websocket API, is a potential hazard. And the admin still has to verify that updates are applied and the OS is patched against kernel vulnerabilities. The threat is lower, perhaps, but hardly zero.
Corn
Honestly, that sounds like a long way of saying four words: maintenance work never stops. Whether it's a Pi under the TV or a VPS in a data center, someone has to apply the updates, rotate the keys, check the logs. The location changes but the toil remains.
Herman
Some of the chores remain. You've moved the responsibility from "keep this physical SD card alive" to "keep this cloud instance patched." Different chores, same underlying truth: smart homes are never fully set-and-forget.
Corn
I see where this nets out. The case extends beyond a one-off apartment move to people who move continuously, say a relocation every eighteen months. Or students who cycle from one dorm to a new layout every year. The use case broadens the longer you think about it.
Herman
It wipes out the rebuild every time. Tenants generally face four or five more moves than homeowners, so the payoff is that much bigger for them. If you know you're going to move five times in the next decade, decoupling the brain from the building stops being a clever hack and starts being the obviously correct architecture.
Corn
Great, so that's the ideal use case. Is there any established name for this beyond our in-house "decoupled control plane"? Has anyone in the community given this pattern a proper name?
Herman
I went wading through forums and community discussions looking for one, and no single term has been formally adopted. There are plenty of discussions, and a frequent description circles around "remote hub" or something vaguer. Commercial product lines such as Hubitat have featured relay-style elements, but those are walled platforms: you don't really own or control the solution, and they lock you into their install scope. So the commercial segment stays fragmented and locked down. Essentially, the term is still vague, and without an agreed name it stays generic.
Corn
So we're basically at the origin point. Somebody did float "cloud-orchestrated local actuation," or COLA, which is awkward and a bit weird, and it stayed obscure for good reason: flat fizzy drinks aren't the aspirational mood of home automation. You called it "decoupled control plane," which is the natural choice, so let's skip anything more official.
Herman
Let's stick with "decoupled control plane" going forward. The name doesn't need to be catchy. It needs to be descriptive enough that someone searching for it can find the discussion.
Corn
It's not really possible to launch a brand-new category from a minor niche anyway. We're not going to coin a term that takes over the Home Assistant subreddit. That's fine.
Herman
Let me round this off with some actionable recommendations. Yes, the architecture is feasible. Latency mainly bothers fast sensor setups and is modest otherwise. Keep adequate fallback programmed: Zigbee group bindings so that lamps keep working locally when the coordinator loses its link to the brain. Moving often is a rational reason to push hard for this approach, especially when you're switching provider anyway. Matter is making gradual progress, so plan that integration carefully and iteratively, at whatever comfort level suits each deployment. What you get is consistent cloud-orchestrated control plus huge relief from the pain of transience. I do suggest a transitional plan rather than a rapid pivot.
Corn
So Daniel should start by testing this on low-stakes devices, observe quietly for a month or so, and build confidence gradually before extending it. I appreciate that, because it sounds thoughtfully iterative. But should he expect friction as he layers on more?
Herman
The Tailscale link to the coordinator doesn't add huge overhead, and the rest is manageable. The friction points are known: mDNS for Matter commissioning, latency for presence sensors, fallback bindings for critical lights. None of them are dealbreakers.
Corn
That's enough before we close the segment. And now: Hilbert's daily fun fact.

Hilbert: In the nineteen-twenties, a blind cave beetle discovered on Sakhalin Island was named Pseudanophthalmus — its genus name translates from Greek roughly to "false eyelessness," meaning the discoverer thought the cave-adapted creature was faking the absence of eyes it literally never had.
Herman
I'm in somewhat doubtful awe. So the taxonomist looked at this beetle, living in total darkness for millions of years, no eyes whatsoever, and said, "I'm not convinced. I think you're hiding them."
Corn
The taxonomy of passive aggression. "Oh, you're blind?"
Herman
How ungenerous to insinuate that the small blind insect was merely pretending. "I see through your eyeless charade, beetle."
Corn
Thanks, Hilbert, for the evocative naming origins, a reminder that even scientists can be suspicious of the obvious.
Herman
At least the label seems to have earned some clarifying justice in the end. The beetle, unaware of the slight, continued living its eyeless life in a cave on an island most people couldn't find on a map.
Corn
Where we end: this really invites trying out decoupled hardware, with a lot of relief from moving-day provisioning anxiety. It's inherently a trade-off, but a neat one overall. Start with a month of light testing, commit to small monthly increments, and keep it modular so you can expand when your instinct says to.
Herman
If listeners test this, I want figures, so please share the speeds you measure and report back. I'm personally curious about how it goes, and I want input and corrections so the results are useful. I want to hear from people who try this. Latency numbers from different VPS providers. How Matter commissioning actually went. Whether the spouse acceptance factor improved.
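For anyone who wants to report latency numbers, one simple way to collect them is to time TCP connections from the cloud instance to a service on the coordinator, since a TCP handshake takes roughly one network round trip. This is a minimal sketch with hypothetical helper names, not a Home Assistant or Tailscale tool. Point it at your own coordinator's tailnet address and a port that is actually listening there, for example 1883 if an MQTT broker runs on that box.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, timeout: float = 2.0) -> float:
    """Time one TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def sample_latency(host: str, port: int, samples: int = 5) -> dict:
    """Take several handshake timings and summarize them."""
    values = [tcp_connect_ms(host, port) for _ in range(samples)]
    return {
        "min_ms": min(values),
        "median_ms": statistics.median(values),
        "max_ms": max(values),
    }


# Example call (hypothetical tailnet address, run from the VPS):
# print(sample_latency("100.64.0.10", 1883))
```

The minimum is usually the most comparable figure between providers, while the gap between minimum and maximum gives a feel for jitter.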
Corn
The show homepage collects updates and lists the contact options, so message us and drop your info, and please leave a review or a suggestion. We'll compile the results and do a follow-up if enough people experiment with the pattern.
Herman
A comprehensive follow-up means we can check on actual progress down the line. This is an architecture that deserves real-world testing, not just theoretical discussion.
Corn
So the prompt turned nicely into a quick practical blueprint and ended strong. Thanks for listening. Huge thanks to our dedicated digital producer Hilbert Flumingtop. We'll talk again next time with another weird prompt.

This has been My Weird Prompts, a human-AI podcast collaboration.

Head to myweirdprompts.com for the show and for ways to contact us. See you soon, listeners.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.