What Local AI Can Do
Local AI is opt-in and off by default. When enabled with Ollama or LM Studio, supported workloads run on-device instead of through the bundled cloud subscription.
- Memory embeddings and summary-tree building run on a local model.
- Background reasoning loops and reflective tasks stay on-device.
- Lightweight chat and classification tasks route locally when the workload hint matches.
- By default, heavy reasoning, agentic tasks, coding, vision, voice TTS, and web search still route to cloud providers.
Setup
Enable local AI in Settings, then choose a preset matching how much workload stays on-device.
- Path: Settings → AI & Skills → Local AI.
- Three presets: embeddings only; memory plus reflection; or full local where supported.
- Override provider and base URL with local_ai.provider and local_ai.base_url in config.toml.
- The default local model is Gemma 3 1B, which needs a few gigabytes of RAM and disk space.
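As a minimal sketch of the override described above, the two keys might look like this in config.toml. Only the key names local_ai.provider and local_ai.base_url come from this page. The provider identifiers are assumptions, and the URLs are the usual default local addresses for Ollama and LM Studio, so confirm both against the current release before relying on them.

```toml
# config.toml (sketch). Key names are from this page; the values are assumptions.
# Dotted keys are used as written here; an equivalent [local_ai] table is also valid TOML.
local_ai.provider = "ollama"                  # assumed identifier, check the accepted values
local_ai.base_url = "http://localhost:11434"  # Ollama's default local address

# Hypothetical LM Studio variant (its local server defaults to port 1234;
# whether a path suffix such as /v1 is expected is not stated on this page):
# local_ai.provider = "lmstudio"
# local_ai.base_url = "http://localhost:1234"
```

Restarting the app after editing the file is a reasonable precaution, since this page does not say whether config changes are picked up live.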
Hardware and Tradeoffs
Local AI pays off when you have heavy email or chat ingestion, need offline summaries, or want privacy-sensitive background reflection to stay on-device. Skip it if your hardware is limited or you have only a few connected sources.
- Worth it: heavy mail or chat ingestion, offline summaries, privacy-sensitive reflection, or an existing GPU.
- Skip it: few connected sources, no GPU, limited RAM, or workflows locked to cloud features like vision and voice.
- Known issue: on Windows 11 with an existing Ollama server, the UI shows the Ollama URL as read-only. As a workaround, set local_ai.base_url in config.toml directly.
- LM Studio support: check current release notes for confirmed vs. planned features.