Guide / 2026
Self-Hosted AI Assistant: The 2026 Practical Guide
A practical guide to running your own AI assistant on your own machine in 2026. What it actually is, why people are moving off hosted chat products, what to look for, what hardware you need, and how to get a first one running in about two minutes.
What is a self-hosted AI assistant?
Definition
A self-hosted AI assistant is a program you run on your own machine that takes requests, calls AI models, remembers context across sessions, runs tools, and keeps all its data local. You own the process, the memory, and the credentials - no third party stores your conversations.
Self-hosted vs cloud AI assistant
A cloud product like a hosted chat app is a SaaS window into somebody else's model. Your messages, your memory, your integrations, and increasingly your files sit on their servers. You get convenience; they get the data.
A self-hosted AI assistant inverts that. You run a daemon - usually in Docker, sometimes as a systemd service - on a Mac, a Linux box, a Raspberry Pi, or a small VPS. It exposes a chat interface (web dashboard, Telegram bot, CLI) and talks to whichever model you pick. The model can be a hosted API (Claude, GPT) or a local one running through Ollama or llama.cpp.
Memory, scheduling, secrets, tool execution, logs - all of it sits on your side of the wire. Unplug the internet and the local-model path still works. If the vendor pivots or gets acquired, your setup does not break.
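As a rough sketch, the deployment shape described above is often a single container with a local data volume. The image name, port, environment variable, and paths below are placeholders for illustration, not any specific project's defaults:

```shell
# Hypothetical single-container deployment of an assistant daemon.
# - the volume keeps memory, vault, and logs on the host
# - binding to 127.0.0.1 keeps the port off the public interface
# - the image name and env var are placeholders, not real defaults
docker run -d \
  --name assistant \
  --restart unless-stopped \
  -v "$HOME/assistant-data:/data" \
  -p 127.0.0.1:8080:8080 \
  -e MODEL_PROVIDER=ollama \
  example/assistant:latest
```

The `--restart unless-stopped` flag is what turns a chat process into a daemon: it survives reboots without a separate service manager.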
Why self-host your AI assistant in 2026?
Privacy by default
Conversations never leave your machine unless you send them somewhere. Medical notes, client data, half-written contracts, credentials - none of it ends up in someone else's logs. A self-hosted assistant with an encrypted vault is a different privacy surface than a hosted chat product with a retention policy written by a legal team.
Cost control
Hosted assistants bundle a model, a UI, a memory layer, and a bill. At $20 per month per product, three subscriptions come to $60 a month, most of which pays for a frontend. Self-hosting lets you pay only the underlying model provider, and only for the tokens you actually use. Many people end up spending $3 to $8 per month on API calls plus a small home server they already owned.
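To make that concrete, here is back-of-the-envelope arithmetic under illustrative assumptions (50 requests a day, about 2,000 tokens each, $3 per million tokens; real rates vary by provider and by input vs output):

```shell
# Back-of-the-envelope API cost estimate. All numbers are
# illustrative assumptions, not any provider's actual pricing.
requests_per_day=50
tokens_per_request=2000
price_per_million=3        # dollars per million tokens (assumed)

tokens_per_month=$((requests_per_day * tokens_per_request * 30))
awk -v t="$tokens_per_month" -v p="$price_per_million" \
    'BEGIN { printf "~$%.2f/month\n", t / 1e6 * p }'
```

Three million tokens a month at that rate lands around $9, which is the gap between paying for tokens and paying for a frontend.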
Always-on automation
A chat window is ephemeral. A daemon is not. A self-hosted AI assistant can run scheduled tasks, watch a mailbox, react to a webhook, or post to Telegram at 7am. The assistant becomes a background worker, not a tab you reopen.
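If the daemon exposes a local HTTP API, even plain cron can drive this pattern. The port and endpoint path below are assumptions for illustration; check your assistant's actual API before copying it:

```shell
# Hypothetical crontab entry: ask the daemon for a morning briefing
# at 07:00 every day. The port and the /api/message route are
# assumed for illustration, not a documented interface.
0 7 * * * curl -fsS -X POST http://127.0.0.1:8080/api/message -d '{"text":"Post my morning briefing to Telegram"}'
```

Many assistants ship their own scheduler, but the point stands either way: the trigger lives on your machine, not in a vendor's job queue.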
Data sovereignty and compliance
For EU users, for regulated industries, or for anyone who takes their own data seriously, location matters. Self-hosting on hardware you control removes entire categories of questions: cross-border transfer, sub-processor lists, vendor breach notifications. The data is in your house - literally or logically.
What to look for in a self-hosted AI assistant
Self-hosted does not automatically mean safe or good. Most projects in this space are weekend experiments wrapped in a README. Five things separate a serious self-hosted AI assistant from a toy:
- Encrypted storage for memory and credentials. API keys, tokens, long-term memory - all of it should be encrypted at rest with a key you control. Plaintext JSON files in a home directory are a non-starter. Look for an integrated vault, not an afterthought plugin.
- Sandboxed skill and plugin execution. The moment an assistant can run third-party code - Python scripts, shell commands, browser skills - that code becomes part of your attack surface. A good project executes skills in containers or WASM sandboxes, with explicit permissions, by default. Not optional, not a plugin.
- Model flexibility. The assistant should treat models as interchangeable backends: Claude, GPT, Gemini, Ollama local models, llama.cpp, whatever ships next year. Any project that hard-couples to one provider will age badly.
- Channel integrations. You need to reach it without opening a laptop. A Telegram or Signal bot, a CLI, a simple web dashboard over Tailscale. If the only interface is a local browser tab, you will stop using it.
- Audit posture. How many lines of code, how many dependencies, what is the CVE track record, how are new skills reviewed? A 430k-line codebase with an optional sandbox is not auditable by a normal developer. A smaller, opinionated project with a clear security model is.
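To see what "sandboxed by default" means in practice, here is a minimal sketch of running an untrusted skill in a locked-down container. The skill script and base image are hypothetical; the isolation flags are the point:

```shell
# Run an untrusted skill in a locked-down container (hypothetical
# skill file and image). The flags enforce: no network, read-only
# filesystem, non-root user, capped memory, no Linux capabilities.
docker run --rm \
  --network none \
  --read-only \
  --tmpfs /tmp \
  --user 65534:65534 \
  --memory 256m \
  --cap-drop ALL \
  -v "$PWD/skills/weather.py:/skill/weather.py:ro" \
  python:3.12-slim python /skill/weather.py
```

A serious project does something equivalent to this for every skill automatically; if you have to remember these flags yourself, the sandbox is the afterthought plugin the list above warns about.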
Hardware requirements
The hardware question splits cleanly depending on whether you plan to run local models or only call hosted APIs.
Mac or Linux desktop
For most people, the box they already use is fine. A modern Mac (M-series) or a Linux desktop with 16 GB of RAM can run the assistant daemon plus a 7B or 8B local model comfortably. If you only call hosted APIs, 8 GB is enough and any machine from the last five years qualifies.
The advantage of a desktop host is that the assistant is there whenever the machine is on. The disadvantage is that you have to remember to leave the machine on, or accept that the assistant sleeps with it.
Raspberry Pi 5 or homelab
A Raspberry Pi 5 with 8 GB of RAM runs a self-hosted AI assistant daemon happily for the always-on case, as long as you delegate the heavy inference to a hosted model or to another machine on the network. Power draw is a few watts. For homelabs with a NAS or a small server, the assistant is one more Docker container that costs essentially nothing to run.
Good fit for automation-first setups: the assistant is reachable 24/7 over Telegram, it runs scheduled jobs, and the model calls go out over the internet only when needed.
VPS
A $5 to $10 per month VPS at Hetzner or DigitalOcean gives you a permanent home on the public internet. Good for developers who want a stable DNS entry, webhook endpoints, and a tidy CI/CD story. Less good from a privacy standpoint: the data is on someone else's machine, even if that machine is under your control.
If you go this route, encrypt the disk and the swap partition, keep the assistant behind a VPN or an authenticated reverse proxy, and do not store anything on the VPS you would not be comfortable losing.
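A minimal hardening pass for the VPS route might look like the following. It assumes a Debian-style box with ufw available and uses Tailscale for private access; adapt the commands to your distro:

```shell
# VPS hardening sketch (assumes Debian/Ubuntu with ufw installed).
# Goal: nothing listens publicly except SSH, and the assistant is
# only reachable over the tailnet, never on the public IP.
sudo ufw default deny incoming
sudo ufw allow OpenSSH
sudo ufw enable

# Join a tailnet so the dashboard is reachable privately.
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

# Then bind the assistant's port to the Tailscale interface (or to
# 127.0.0.1 behind an authenticated reverse proxy), not to 0.0.0.0.
```

This is a floor, not a full checklist; it does not replace disk encryption or regular updates.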
Options in 2026
The space consolidated through 2025 and early 2026. Five projects worth knowing, with one-paragraph assessments each. For a focused comparison with the largest incumbent, see our detailed OpenClaw comparison.
ALF OS
Self-hosted AI assistant built around an encrypted credential vault, namespace-isolated skills, and always-on automation. CLI plus a Docker Compose stack, Telegram bot with voice, web dashboard, model-agnostic (Claude, GPT, Ollama, OpenRouter), installs with one curl command. Opinionated and MIT-licensed. What this site is about.
OpenClaw
The largest ecosystem - thousands of community skills and the most integrations. Has taken serious hits on security in 2026 (ClawHavoc, CVE-2026-25253), and the codebase at 430k lines is effectively unauditable for an individual developer. Still the default if you need a specific OpenClaw-only skill; otherwise, increasingly hard to recommend. See the detailed OpenClaw comparison for the full picture.
Nanobot
A deliberately minimalist project, roughly 4k lines of code. Easy to read end to end, which is a real security property. The tradeoff is that features like long-term memory, vault, and channel integrations are either absent or rough. Good base for a hacker who wants to extend. Not a finished product.
ZeroClaw
A Rust implementation with portability as the first principle. Runs well on ARM boards and low-power hardware, ships as a static binary, boots fast. Skill ecosystem is smaller and still mostly developer-facing. A good pick for Raspberry Pi setups where every MB of memory counts.
Moltis
Production-minded: proper metrics, structured logs, reasonable defaults, clear upgrade path. Targets small teams as much as individuals, which shows in the configuration surface - heavier to set up than the others, but holds up under load. Worth a look if you want a self-hosted AI assistant for a team of three to ten people.
How to install your first self-hosted AI assistant
The fastest way to make the above concrete is to install one. The walkthrough below uses ALF OS and assumes Docker is installed, plus either a hosted model API key (Claude or GPT) or an Ollama instance reachable on your network.
- Step 1 - Install the CLI
curl -fsSL https://install.alfos.ai | sh
This fetches the binary, verifies the checksum, pulls the Docker image, and puts alf on your PATH. No root, no system-wide services.
- Step 2 - Initialise and launch
alf init
This walks you through timezone, one model provider, and an optional Telegram bot token, and creates the encrypted vault with a passphrase you pick. The daemon starts automatically when init finishes - no separate start command. From there the local dashboard is up, and the Telegram bot is live if you configured one.
Two commands, same on a Mac laptop, a Linux desktop, a Raspberry Pi 5, or a $5 VPS. The daemon is identical; only the host changes.
Common pitfalls
Most bad self-hosted AI assistant setups fail in the same four ways. None of these are exotic. They are the boring basics that get skipped in a hurry.
- Exposing it to the public internet without auth. An AI assistant that can run shell commands and read your files is a remote code execution primitive with extra steps. If it is reachable on a public IP with no authentication, it will be found. Put it behind a VPN (Tailscale, WireGuard), a reverse proxy with auth, or a messaging channel like Telegram that does the auth for you.
- Running it as root. The assistant runs arbitrary model-generated commands. Root multiplies every mistake by the whole filesystem. Use a dedicated user, or - better - a container with a non-root user and a read-only root filesystem.
- Installing unaudited skills (the ClawHavoc lesson). The February 2026 ClawHavoc incident - 1,184 malicious skills, 9,000+ compromised installs in the OpenClaw ecosystem - happened because skill marketplaces shipped third-party code with no sandbox and no review. A self-hosted AI assistant is only as safe as the weakest skill you install. Pin versions, read the code, prefer sandboxed execution.
- Forgetting the vault key. The encrypted vault is useless if you write the passphrase on a sticky note, and catastrophic if you lose it - the API keys and credentials encrypted with that key go with it. Store the recovery material in a password manager, not on the same machine that runs the assistant.
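The "pin versions, read the code" advice can be made mechanical: record a checksum for the exact skill artifact you reviewed, then verify it before every install. The file names below are illustrative:

```shell
# Pin-and-verify workflow for a third-party skill archive.
# File names are illustrative; the point is that the checksum is
# recorded once, after review, and checked on every reinstall.
printf 'demo skill payload\n' > skill-weather-1.2.0.tar.gz

# After reviewing the code once, record its checksum:
sha256sum skill-weather-1.2.0.tar.gz > skill-weather-1.2.0.tar.gz.sha256

# On every subsequent install, refuse anything that has drifted:
sha256sum -c skill-weather-1.2.0.tar.gz.sha256
```

If the archive changes upstream, the final check fails instead of silently installing code nobody reviewed, which is exactly the failure mode ClawHavoc exploited.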
FAQ
Is a self-hosted AI assistant free?
The software can be free and open source. You still pay for the hardware it runs on and, if you plug in a hosted model like Claude or GPT, for that API usage. If you run a local model via Ollama or llama.cpp, the marginal cost is electricity.
Do I need a GPU?
No. A self-hosted AI assistant is an orchestration layer. It can call hosted models over the network, or small local models on CPU. A GPU only matters if you want to run large local models at reasonable speed.
Is it safe to expose my self-hosted AI assistant on the internet?
Only if you put it behind authentication, keep it on a private network or VPN, and review the skills it runs. Most people should use a Telegram bot, Tailscale, or a reverse proxy with auth rather than opening a public port.
Can I use it from my phone?
Yes. The common patterns are a Telegram bot that talks to your assistant, or a web dashboard reachable over a VPN like Tailscale or WireGuard.
What happens to my data?
With a properly self-hosted assistant, conversations, memory, and credentials stay on your machine. The only data that leaves is whatever you send to a hosted model provider, and only the prompt for that specific call.
How is a self-hosted AI assistant different from ChatGPT?
ChatGPT is a hosted chat product. A self-hosted AI assistant is a process you run that can remember past conversations, run on a schedule, call tools, and use any model you plug in. It is closer to a personal automation daemon than to a chat window.
Install a self-hosted AI assistant in under two minutes
ALF OS is free, self-hosted, and opinionated. Encrypted vault by default, scoped-permission skills, Telegram bot, runs on Mac, Linux, Raspberry Pi, or a VPS.
curl -fsSL https://install.alfos.ai | sh