What Local AI Can Do

OpenHuman's local AI feature routes selected workloads to Ollama or LM Studio on your machine instead of cloud providers. It is opt-in and off by default. When enabled, memory embeddings, summary-tree building, and lightweight chat run locally. Heavy reasoning, coding, vision, and voice still route to cloud frontier models unless you explicitly configure otherwise.

Local: memory embeddings, summary-tree building, background reflection, lightweight chat, classification.
Cloud (default): heavy reasoning, coding, agentic tasks, vision, ElevenLabs TTS, web search.
Default model: Gemma 3 1B, requiring ~2 GB RAM and a few GB disk space.
Override: set local_ai.provider and local_ai.base_url in config.toml for custom setups.
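
The split above can be pictured as a small routing table. The sketch below is illustrative only: the workload names and the route() helper are hypothetical, not OpenHuman identifiers, and it assumes that anything not listed as local goes to the cloud.

```python
# Hypothetical routing sketch; names are illustrative, not OpenHuman's API.
LOCAL_WORKLOADS = {
    "memory_embeddings",
    "summary_tree",
    "background_reflection",
    "lightweight_chat",
    "classification",
}


def route(workload: str, local_ai_enabled: bool = False) -> str:
    """Return "local" or "cloud" for a workload name.

    Local AI is opt-in, so the flag defaults to False and every
    workload goes to the cloud until it is switched on.
    """
    if local_ai_enabled and workload in LOCAL_WORKLOADS:
        return "local"
    return "cloud"
```

With the feature off, route("memory_embeddings") yields "cloud". With it on, the same call yields "local", while route("vision", True) still yields "cloud".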

Setup

Enable local AI in Settings, then choose a preset matching how much workload stays on-device.

Path: Settings → AI & Skills → Local AI.
Three presets: embeddings only; memory plus reflection; or full local where supported.
Override provider and base URL with local_ai.provider and local_ai.base_url in config.toml.
The default local model is Gemma 3 1B, which needs about 2 GB of RAM and a few gigabytes of disk space.

Hardware and Tradeoffs

Local AI pays off when you have heavy email or chat ingestion, need offline summaries, or want privacy-sensitive background reflection. Skip it if your hardware is limited or you have only a few connected sources.

Worth it: heavy mail or chat ingestion, offline summaries, privacy-sensitive reflection, or an existing GPU.
Skip it: few connected sources, no GPU, limited RAM, or workflows locked to cloud features like vision and voice.
Known issue: on Windows 11 with an existing Ollama server, the UI shows the Ollama URL as read-only. As a workaround, set local_ai.base_url in config.toml directly.
LM Studio support: check current release notes for confirmed vs. planned features.
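
When pointing OpenHuman at an existing Ollama server, it can help to confirm that the server is reachable and has a model installed before editing config.toml. The sketch below uses Ollama's /api/tags endpoint, which lists locally installed models. The helper names are hypothetical, and the default URL assumes Ollama is listening on its standard port.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434",
                      timeout: float = 3.0) -> list[str]:
    """Ask a running Ollama server which models it has installed.

    Raises urllib.error.URLError if the server is not reachable.
    """
    url = base_url.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return model_names(json.load(resp))
```

If list_local_models() raises a connection error, the server is not running at that address. If it returns an empty list, no model has been pulled yet.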

Related Guides

Local LLM vs Cloud LLM

AI Persistent Memory

OpenHuman Setup Guide