A client once shipped a "smart" light switch that asked a server, somewhere, whether you'd pressed it. Press, half-second pause, light. They'd put a cloud round-trip in the path of flicking a switch. Then they wondered why the reviews were brutal.
That's the whole problem with consumer IoT in one anecdote. There are twenty-odd internet-connected things in my house — I counted — and most are connected, not smart. The distinction isn't pedantic, it's the entire product. A connected device streams data somewhere. A smart device decides close to where the data is born. The gap between the two is one architectural choice, and teams botch it more often than not.
So I won't catalogue the market. I'll give you the decision: for any given feature, does the inference run on the device or in the cloud? Get that right and the downstream product problems mostly evaporate. Get it wrong and no amount of UX polish saves you.
Three things the boundary controls
The first IoT wave, a decade back, was cloud-everything — devices were sensors, the cloud did the thinking, the device was a dumb terminal. Edge AI moved that line by putting the model on the device. Where you draw it sets three things.
Availability. A thermostat that only acts once it reaches a server isn't a thermostat. It's a remote control with extra steps. Local inference keeps the device useful when the network drops. If the feature has to work in a power cut or a dead spot, the model lives on the device.
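Here's roughly what "the model lives on the device" means in code. This is a minimal sketch, assuming a hypothetical thermostat class of my own invention (`LocalThermostat`, `decide`, `flush` and the 0.5 degree hysteresis are all illustrative, not any vendor's API). The control decision never touches the network; reporting is best-effort and happens afterwards.

```python
from collections import deque
from typing import Callable


class LocalThermostat:
    """Hypothetical local-first thermostat: decide on-device, report later."""

    def __init__(self, setpoint_c: float, hysteresis_c: float = 0.5,
                 max_queue: int = 1000):
        self.setpoint_c = setpoint_c
        self.hysteresis_c = hysteresis_c  # assumed value, for illustration
        self.heating = False
        # Bounded queue: after a long outage the oldest readings are dropped.
        self.pending = deque(maxlen=max_queue)

    def decide(self, room_temp_c: float) -> bool:
        """Purely local decision. No network call sits in this path."""
        if room_temp_c < self.setpoint_c - self.hysteresis_c:
            self.heating = True
        elif room_temp_c > self.setpoint_c + self.hysteresis_c:
            self.heating = False
        self.pending.append({"temp_c": room_temp_c, "heating": self.heating})
        return self.heating

    def flush(self, send: Callable[[dict], None]) -> int:
        """Best-effort upload. `send` is an assumed uplink function."""
        sent = 0
        while self.pending:
            try:
                send(self.pending[0])
            except OSError:
                break  # network is down; keep the data and retry later
            self.pending.popleft()
            sent += 1
        return sent
```

If `send` raises because the router is off, `decide` keeps working and the readings wait in the queue. The cloud gets its telemetry when it can, and the room gets warm regardless.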
Privacy. Every byte that leaves the device is a byte someone now has to govern. On-device keyword spotting means your speaker sends the intentional query upstream, not the ambient sound of your kitchen. Your privacy posture is set by the data-flow architecture. The policy just describes it after the fact.
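The keyword-spotting gate is simple enough to sketch. Assume a hypothetical on-device scorer, `score_wake_word`, that returns a confidence for each audio frame; the 0.8 threshold and the three-frame query window are made-up numbers. What matters is that frames heard while the gate is closed are never handed to anything that could upload them.

```python
from typing import Callable, Iterable, Iterator


def gate_upstream(
    frames: Iterable[bytes],
    score_wake_word: Callable[[bytes], float],
    threshold: float = 0.8,   # assumed confidence cut-off
    query_frames: int = 3,    # assumed length of the query window
) -> Iterator[bytes]:
    """Yield only the frames that follow a detected wake word.

    `score_wake_word` stands in for an on-device keyword-spotting model
    (hypothetical). Ambient frames are dropped here, on the device.
    """
    remaining = 0
    for frame in frames:
        if remaining > 0:
            remaining -= 1
            yield frame  # part of the intentional query
        elif score_wake_word(frame) >= threshold:
            remaining = query_frames  # open the gate for the next few frames
        # Otherwise: ambient audio, never yielded, so never uploaded.
```

The uploader only ever sees what this generator yields, so the data-flow guarantee lives in the code path, not in the policy document.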
Latency. A cloud round-trip can cost hundreds of milliseconds, and on a bad home network it costs more. Local inference on a small model skips the network and feels instant. For a fall-detection pendant, a smoke alarm, a baby monitor, that interval is the difference between a feature and a safety claim you can't legally make.
The rule
Run inference where the data is. Move it only when you can name the reason — the model's too big for the device, you genuinely need to aggregate across many devices, or the task truly tolerates latency and a network dependency. Otherwise, local. That's it.
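The rule is small enough to write down as a function. This is a sketch under my own assumptions: the `Feature` fields and the kilobyte comparison are invented for illustration, and a real review would weigh more than three booleans. It does capture the shape of the decision: local is the default, and the cloud needs a named reason.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Feature:
    """Hypothetical description of one feature on the roadmap."""
    name: str
    model_kb: int                        # size of the model
    device_budget_kb: int                # memory the device can spare for it
    needs_fleet_aggregation: bool = False
    tolerates_latency: bool = False
    tolerates_network_dependency: bool = False


def cloud_reasons(feature: Feature) -> List[str]:
    """List every nameable reason to move inference off the device."""
    reasons = []
    if feature.model_kb > feature.device_budget_kb:
        reasons.append("model too big for the device")
    if feature.needs_fleet_aggregation:
        reasons.append("needs aggregation across many devices")
    # Both tolerances must hold; latency tolerance alone is not enough.
    if feature.tolerates_latency and feature.tolerates_network_dependency:
        reasons.append("tolerates latency and a network dependency")
    return reasons


def place(feature: Feature) -> str:
    """Local unless at least one cloud reason can be named."""
    return "cloud" if cloud_reasons(feature) else "device"
```

Run the light switch through it, `place(Feature("light switch", model_kb=4, device_budget_kb=64))`, and it comes back `"device"` with an empty reason list. A non-empty list only permits the move; whether to make it is still a judgement call.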
Held against that rule, the good products are obvious. A doorbell that recognises familiar faces on the device and never transmits the face data. A hearing aid running models that pull speech out of noise and re-tune dozens of times a second, all locally. A leak sensor running a trivial model in the basement that fires one clear alert. None of them announce their intelligence. They just sit on the correct side of the line.
The failures are equally legible, and they're nearly always boundary errors dressed up as features:
- The light switch above. A cloud dependency on a task with zero cloud justification.
- Surveillance rebranded as convenience — devices hoovering up far more than the feature needs, governed loosely. If the privacy policy is longer than the manual, the architecture is telling on itself.
- Notification volume tuned to an engagement dashboard instead of the user. A device that pings you every time it does its job has confused activity with value.
- Twelve "AI programmes" on a washing machine that all resolve to the same cycle. Capability theatre.
The bill that sinks projects
Here's what doesn't fit on the roadmap and kills projects anyway. Shipping a model to the edge is the start of an obligation, not the end. You now own model updates across a fleet of devices with different hardware, in homes you can't reach, half of them offline when you push. You own the awkward fact that a model that ran fine on last year's silicon may not fit next year's cost-reduced bill of materials. You own graceful degradation when the model won't load.
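Graceful degradation is the one item on that list you can sketch in a dozen lines. Assume a leak sensor with a hypothetical `load_model` function that may fail on a given unit; the fixed 0.6 moisture limit is an invented fallback value. If the learned model won't load, the device falls back to a plain rule instead of going silent.

```python
from typing import Callable, Tuple

Detector = Callable[[float], bool]


def threshold_detector(limit: float = 0.6) -> Detector:
    """Fallback rule: alert when the moisture reading crosses a fixed limit.

    The 0.6 default is an assumed value for illustration.
    """
    return lambda moisture: moisture >= limit


def build_detector(load_model: Callable[[], Detector]) -> Tuple[Detector, str]:
    """Try the learned model first; fall back to the fixed rule if it fails.

    `load_model` is a hypothetical loader supplied by the firmware.
    Returns the detector plus a label saying which one is active.
    """
    try:
        return load_model(), "model"
    except (OSError, MemoryError, ValueError):
        # Missing file, too big for this hardware revision, or corrupt weights.
        return threshold_detector(), "fallback"
```

The label is worth reporting upstream when the device next connects. A fleet where a growing share of units say `"fallback"` is telling you the model no longer fits the hardware you're shipping.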
Cloud inference makes most of that go away: there's one endpoint, and you update it once. That convenience is exactly why teams over-centralise, and exactly why so many "smart" devices end up brittle and chatty. The edge is the right default for the three properties above, but only if you've budgeted for the lifecycle that comes with it. Most roadmaps budget the model and forget the fleet.
The version worth building is a home that's slightly more careful on your behalf — catches the leak before the floor's ruined, warms the room before you wake, notices the relative who hasn't moved since yesterday. Almost all of it decided locally, almost none of it leaving the building. That's reachable. But it's an architecture decision before it's a product one. Decide where each model runs, and why, before you write a line of the app.