OpenClaw Alternative: Rebuilding an AI Assistant on Open WebUI & Ollama

I introduced Sebastian a while back, our Caribbean-accented, monocle-wearing crab of a family assistant who runs the household’s digital plumbing. That version lived on OpenClaw, and we loved it. Still do. This post is the sequel: I’ve spent the last stretch rebuilding the exact same assistant on completely different bones, as an experiment, to see what changes when you keep the capabilities you love but swap out everything underneath them. Spoiler: the new home is the little PC in our living room.

Let me be clear up front, because it’s easy to read this the wrong way: this is not a “why I left OpenClaw” post. OpenClaw is excellent, it runs locally just as easily as my setup does, and Sebastian still has a perfectly good home there. The motivation isn’t ownership and it isn’t “mine is better.” It’s that I’m genuinely curious about a different set of tradeoffs, whether I can hold onto the tools and agentic behavior our family actually uses every day while making the whole thing a little easier to maintain, and a little more contained, for a household that didn’t sign up to be sysadmins.

Where Sebastian came from

For the newcomers: OpenClaw is an open-source, local-first “gateway” for building your own personal AI assistant, its tagline is “Your own personal AI assistant. Any OS. Any Platform.” It’s properly agentic, not just a chatbot: real tools (browser, scheduled jobs, channel actions, code execution), multi-agent routing, and a clever way of defining an agent entirely in plain-text workspace files. A SOUL.md holds the personality, tone, and boundaries; an AGENTS.md holds the operating instructions; skills and TOOLS.md define what it can do. Our Sebastian ran on a tiny rented VPS, talked to the family through a self-hosted Matrix/Element chat stack across a few dozen rooms, was powered by Ollama Cloud for the conversational brain, and could hand coding tasks to Claude Code when it needed real muscle. He even had a SOUL.md with five distinct voice registers, from casual banter to crisis mode.

We ran that for weeks. It was great. But a few things kept nagging.

The friction (and it really was just friction)

None of these are dealbreakers, and most are the natural price of how flexible and powerful OpenClaw is. But together they’re what made me curious enough to start tinkering with an alternative:

Something in the integration layer was always acting up. One workaround or another always seemed to need babysitting, most memorably the Matrix/Element end-to-end encryption, which loved to demand a fresh re-login after a restart or a version bump.
Sebastian was a little too eager with the filesystem. He’d reach into files and folders he had no real business touching. Powerful, but not what you want from an assistant the whole family pokes at.
The memory system tended to balloon. Continuity is the best feature, but it kept running into bloat.
The coding hand-off didn’t always hand off. Sebastian could call out to a dedicated coding agent, which is what I wanted, but his main agent could also write code itself. So sometimes when I explicitly wanted the careful coding agent, he’d just do it himself with the chat model instead. And a general chat model freelancing in your repo is how you get a bad afternoon.

So the question I set out to answer wasn’t “how do I escape OpenClaw” — it was “if I rebuild Sebastian on a different foundation, can I keep everything we love and design these particular headaches out from the start?”

The experiment: same Sebastian, new bones

The new Sebastian runs on the a small PC, a small AMD box with a Ryzen 7, a Radeon 780M integrated GPU, and 32 GB of RAM. By day (and night) it’s an always-on AI server; when someone wants to play, it’s still a gaming PC. Same assistant, same persona, same cloud brains and same coding muscle, different skeleton underneath:

Open WebUI — the chat interface, running in Docker. This is the big swap: instead of a multi-channel Matrix stack, the family just opens one clean web app. We give up “reach me in any messaging app” and get a lot fewer moving parts in return.
Ollama — serves every model, cloud and local, behind one API. Same Ollama Cloud that powered the OpenClaw build; this part didn’t need fixing.
A project service — a Python app I wrote with FastAPI that gives each chat a real workspace on disk, runs the memory jobs, and brokers the coding agent.
A homegrown automation framework — a small daemon that runs recurring jobs written in plain Python, our stand-in for OpenClaw’s scheduled-task side.
Tailscale + an nginx reverse proxy — Tailscale is a mesh VPN so the box stays on our private network; nginx puts a clean HTTPS address in front of it for our phones, without exposing the machine to the open internet.

Models: open-weight first, and one that’s truly open

I run a small lineup and pick per-conversation from a dropdown. The default is a big general-purpose model for everyday chat (Kimi K2.6), there’s a faster one for quick stuff (MiniMax M3), an alternate (Mistral Large 3), and a small model that runs entirely on the machine (Gemma 4 e4b or OLMo 2) for when I want something fully local and private. The first three are served from the cloud through Ollama, same as the old setup; the small ones run right on the box, which the old VPS could never have done.

That local option is half the reason the gaming PC is such a good fit. I’ve started experimenting with OLMo 2 from the Allen Institute for AI, a model that’s open in the fullest sense, open weights and open training data and open training code. It runs on the machine’s Radeon 780M integrated graphics through ROCm, leaning on shared system memory; the 13B version needs around 10 GB and never touches a dedicated graphics card. Most “open” models only share the weights. OLMo shares the whole recipe.

Keeping the parts we loved

The first job was to not lose anything. OpenClaw assembles an agent from workspace files; I rebuilt each of those primitives on top of Open WebUI and my own services, so Sebastian feels like himself.

His soul (SOUL.md → a system prompt). Sebastian’s personality, voice, and boundaries live in a system prompt, the same role SOUL.md played. The crab kept his monocle.
His rulebook (AGENTS.md → the database). The operating instructions, how to use tools, when to delegate, how to behave, all live in Open WebUI’s SQLite database as one global instruction layer every model inherits, instead of a workspace file. One live source of truth I can edit through the UI and never have to re-write when I swap a model.
His tools and coding muscle (TOOLS.md/skills → custom tools + Claude Code). Read-only project access, web search, and the ability to create scheduled jobs, plus delegation to Claude Code for real coding work.
A real workspace (→ git project folders). Here I kept OpenClaw’s plain-files philosophy: every folder in the chat UI maps to an actual directory on disk, version-controlled with git.
Memory and a heartbeat (→ a memory job + an automation daemon). A background job distills each project’s recent activity into a memory file and a per-folder knowledge base; a separate daemon runs the scheduled jobs, news digests and the like, that anyone in the family can set up just by asking.

Designing out the headaches

This is the actual point of the experiment. For each thing that nagged me on the old build, the new bones let me try a different approach. I want to be honest that these are tradeoffs, not upgrades, I’m giving up flexibility to get them, and that’s exactly the balance I’m testing.

The chat model literally cannot touch your code

This is the one I’m happiest about. In the new setup the conversational model is read-only on project files, it can list and read them, and that’s it. If you want anything written or changed, it has to delegate to the coding agent. There’s no “main agent decides to do it itself” path, because that path doesn’t exist. No more chat model freelancing in the repo when I asked for the careful one. For our use case, that single bit of rigidity is worth a lot.

It only sees the folder you’re in

The eager-filesystem problem gets answered with least privilege. The project tool only exposes the folder of the chat you’re currently in, the read-only rule keeps the chat model from changing anything anywhere, and the coding agent runs inside a tightened systemd sandbox with a read-only home and only the specific paths it genuinely needs. Sebastian can’t wander into things he shouldn’t, because he’s structurally not allowed to.

Memory that’s kept on a leash

Instead of one growing store, memory is distilled per folder on a schedule, with guardrails, it skips conversations that are mid-flow, rate-limits itself, and ignores trivial changes, and old versions get archived rather than piled on. A separate guard trims a runaway chat before it overflows the model’s context window. Continuity without the bloat, at least so far.

Fewer integrations, and the ones that exist test themselves

Collapsing from a multi-channel Matrix stack to a single web interface removed most of the “something’s flaky again” surface area in one stroke. And for the automations the family writes, I added a gate so nothing schedules itself blindly:

Test before enable. Every scheduled job clears a validation gate first — a static scan plus a sandboxed dry-run where it actually executes, with real model and search calls, but every side effect is intercepted and thrown away. It’s graded PASS, NEEDS REVIEW, or FAIL, only a passing job gets scheduled, and any edit forces a fresh check. The system refuses to run code it hasn’t vetted.

On top of that, a small repair job sweeps the chats every few minutes and fixes the classic failure where a model returns one empty reply and wedges a whole conversation. The family never sees it happen.

A thousand small workarounds (the fun part)

I’d love to tell you a new foundation means no more babysitting. It doesn’t. Every system has its gremlins; these are just different gremlins. The honest difference is that this set has been easier to pin down and write off for good. A sampling:

The upgrade that broke everything. An Open WebUI update rebuilt its container with default settings and dropped the values it needs to reach Ollama, so it came back up with zero models. The fix was to pin the container in a declarative Docker Compose file so an update can’t quietly change what it depends on.
The agent couldn’t write its own config. My security sandbox made the coding agent’s config file read-only, so it crashed on launch until I explicitly allowed that one file.
Long jobs got guillotined. The coding agent’s two-minute command timeout kept killing longer steps mid-run, so I raised it to ten.
A big file killed the connection. One oversized line of streamed output blew past a 64 KB buffer limit and severed the stream, until I bumped the limit way up.
A silent 403. A shared channel created through the UI was private by default, so the assistant’s posts were rejected until the system learned to open it up on first use.

Whether that maintenance burden ends up genuinely lighter than the OpenClaw build’s is exactly the thing I’m still measuring. Early signs are good. Ask me again in a few weeks.

Bonus: previewing the apps we build

Since some of our projects are actual web apps, I set up a reusable way to preview them on the home network: a single systemd template plus a one-line config file per app. Static sites get served by Python’s built-in web server; Next.js apps run their dev server with hot-reload. Both survive a reboot and restart themselves if they crash, and adding the next app is a two-step copy-paste.

What’s next

The frontier I’m excited about: I have a small game project, Dispatch Zero, and I want its AI powered by this box, specifically by that truly-open OLMo model instead of a paid cloud API. The game’s server lives on its own VPS, joined to the same Tailscale network, so it can call this box’s Ollama API directly and privately. It turns the machine from “our family’s assistant” into something closer to “our family’s AI utility.”

And to be clear about where this leaves OpenClaw: nowhere bad. It’s a fantastic project that taught me what a personal AI assistant could even be, and everything Sebastian is, he’s because of that blueprint. This is just poking at the same idea from a different angle and chasing the sweet spot between “does everything” and “never needs me”. Two sets of bones, one crab with a monocle. Not bad for a couch computer.