TL;DR
I rebuilt my self-hosted AI chat app from the ground up. Helix AI Studio v2.0 now connects 7 AI providers, runs a 3-step automated pipeline (Plan → Execute → Final Answer), and supports CrewAI multi-agent teams — all in a single lightweight web UI you can run entirely on your own hardware.
Live Demo | GitHub | MIT License
Why I Built This
I was tired of switching between ChatGPT, Claude, Ollama in the terminal, and various other AI tools throughout my day. I wanted one UI that could talk to all of them.
The first version was a good start, but as I kept using it daily, I realized the app needed to go beyond just “chat with multiple providers.” I needed:
- Automated workflows — not just Q&A, but multi-step task execution
- Team-based AI — multiple agents collaborating on complex problems
- CLI integration — using Claude Code, Codex, and Gemini CLI directly from the web UI
So I rebuilt it. Here’s what v2.0 looks like.
What’s New in v2.0
1. 3-Step Pipeline: Plan → Execute → Final Answer
Instead of just sending a prompt and getting a response, v2.0 can run an automated pipeline:
Step 1: Plan — A cloud/CLI model analyzes your task and generates a plan
Step 2: Execute — A local model (or CrewAI team) executes the plan
Step 3: Final Answer — A cloud/CLI model verifies results and delivers the answer
Different models are good at different things. A powerful cloud model like Claude can create an excellent plan, a fast local model can do the heavy lifting, and then Claude can verify the output. You get cloud-quality reasoning with local-model execution.
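The repo has the real implementation; stripped to its core, the pattern is three pluggable model calls chained together. A minimal sketch (the `planner`/`executor`/`verifier` names and prompts are illustrative, not Helix's actual API):

```python
from typing import Callable

def run_pipeline(task: str,
                 planner: Callable[[str], str],
                 executor: Callable[[str], str],
                 verifier: Callable[[str, str], str]) -> str:
    """Plan -> Execute -> Final Answer, with pluggable model backends.

    planner/verifier would be cloud/CLI model calls, executor a local model
    (or a CrewAI team) in the actual app.
    """
    plan = planner(f"Create a step-by-step plan for: {task}")
    draft = executor(f"Follow this plan:\n{plan}\n\nTask: {task}")
    return verifier(task, draft)
```

Because each stage is just a callable, swapping Claude for a local model (or a whole agent team) at any step is a one-line change.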
2. CrewAI Multi-Agent Teams
v2.0 integrates CrewAI for multi-agent collaboration, running entirely on local models via Ollama. Three preset teams are ready to go:
- dev_team — for coding tasks (architect, developer, reviewer)
- research_team — for research and analysis
- writing_team — for content creation
Each agent can use a different model, and the system estimates VRAM usage so you know if your GPU can handle it. This is all Ollama-only — no cloud API costs.
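The post doesn't spell out the VRAM formula, but a plausible back-of-the-envelope version is parameter count times bytes per weight for the quantization, plus a fixed runtime overhead. The constants below are illustrative assumptions, not Helix's actual numbers:

```python
# Approximate bytes per parameter for common Ollama quantizations (assumed values).
QUANT_BYTES = {"q4_0": 0.5, "q8_0": 1.0, "f16": 2.0}

def estimate_vram_gb(params_b: float, quant: str = "q4_0",
                     overhead_gb: float = 1.5) -> float:
    """Rough VRAM to load one model: weights plus KV-cache/runtime overhead."""
    return params_b * QUANT_BYTES[quant] + overhead_gb

def team_fits(models: list[tuple[float, str]], gpu_vram_gb: float) -> bool:
    """Would every agent's model fit on the GPU if loaded simultaneously?"""
    return sum(estimate_vram_gb(p, q) for p, q in models) <= gpu_vram_gb
```

For example, two 7B q4 models come in around 10 GB by this estimate, comfortably inside a 24 GB card.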
3. CLI Agent Integration
v2.0 can use Claude Code CLI, Codex CLI, and Gemini CLI as providers, directly from the web UI.
The CLI tools are auto-detected. If you have them installed, they appear in the provider dropdown. If not, they’re hidden.
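Auto-detection like this usually amounts to probing `PATH`. A minimal sketch (the provider-to-binary mapping matches the CLI tools named above; the function itself is not Helix's actual code):

```python
import shutil

# CLI agents the UI can offer as providers, mapped to their executables.
CLI_PROVIDERS = {
    "Claude Code CLI": "claude",
    "Codex CLI": "codex",
    "Gemini CLI": "gemini",
}

def detect_cli_providers(providers: dict[str, str] = CLI_PROVIDERS) -> list[str]:
    """Return only the providers whose binary is found on PATH."""
    return [name for name, binary in providers.items()
            if shutil.which(binary) is not None]
```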
The Full Feature Set
7 AI Providers in One UI
| Provider | Method | Streaming |
|---|---|---|
| Ollama | HTTP API (localhost) | Yes |
| Claude API | Anthropic SDK | Yes |
| OpenAI API | OpenAI SDK | Yes |
| vLLM / llama.cpp / LM Studio | OpenAI-compatible API | Yes |
| Claude Code CLI | `claude -p` | Pseudo |
| Codex CLI | `codex exec` | Pseudo |
| Gemini CLI | `gemini -p` | Pseudo |
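"Pseudo" streaming here presumably means the CLI tools don't emit token-level deltas, so the UI chunks whatever their stdout produces as it arrives. A minimal sketch of that pattern (not Helix's actual code):

```python
import subprocess
from typing import Iterator

def pseudo_stream(cmd: list[str], chunk_size: int = 64) -> Iterator[str]:
    """Run a CLI agent and yield its stdout in fixed-size chunks as it arrives."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:  # empty string means the process closed stdout
                break
            yield chunk
    finally:
        proc.wait()
```

From the frontend's perspective the chunks look like a token stream, even though the CLI produced its answer in one shot.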
RAG Knowledge Base
- Docling Parser for PDF, Office docs, and images
- Hybrid search — dense vector + BM25 sparse + RRF fusion
- TEI Reranker (bge-reranker-v2-m3) for precision re-scoring
- Ollama embedding — runs locally, zero API cost
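Reciprocal Rank Fusion is simple enough to show inline. This is the standard formula (each document scores the sum of 1/(k + rank) over the ranked lists it appears in), not Helix's exact implementation:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked result lists (e.g. dense vector and BM25) with RRF.

    Documents near the top of any list get a high reciprocal-rank score;
    k=60 is the conventional smoothing constant from the original paper.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

The fused list then goes to the TEI reranker for the final precision pass.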
Mem0 Shared Memory
Persistent, cross-session memory backed by Qdrant. The memory is shared across tools — Claude Code CLI, Codex CLI, and Open WebUI all read from the same Qdrant collection.
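Sharing memory across tools comes down to pointing every client at the same Qdrant collection. A hedged sketch of what that might look like for Mem0 (the field names follow Mem0's documented config shape; the host, port, and collection name are placeholders, not Helix's values):

```python
# All tools (Helix, Claude Code CLI, Codex CLI, Open WebUI) would point at
# the same collection, so memories written by one are visible to the others.
SHARED_MEMORY_CONFIG = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "host": "localhost",
            "port": 6333,
            "collection_name": "shared_memory",  # placeholder name
        },
    },
}
# With mem0 installed, this would be consumed via Memory.from_config(...).
```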
Web Search
Trigger a search manually with the search button, or let the LLM decide on its own when it needs current information.
Tech Stack
- Backend: FastAPI + Python 3.12
- Frontend: Jinja2 templates + Tailwind CSS + Alpine.js (no React, no build step)
- Database: SQLite (chat history) + Qdrant (vectors)
- Streaming: WebSocket
- Deployment: Docker Compose or bare metal
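The streaming path is essentially: consume the provider's token stream and forward each piece over the WebSocket, with a sentinel marking completion. A stdlib-only sketch of that relay (the `[DONE]` sentinel is an assumption, not Helix's wire format):

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable

async def relay_stream(tokens: AsyncIterator[str],
                       send: Callable[[str], Awaitable[None]]) -> None:
    """Forward each token to the client as it arrives, then signal completion."""
    async for token in tokens:
        await send(token)   # e.g. websocket.send_text(token) in FastAPI
    await send("[DONE]")
```

Because the relay only depends on an async iterator and an async send callable, the same code serves real streaming (Ollama, the API SDKs) and the pseudo-streaming CLI providers alike.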
Getting Started
One-Click Deploy (Free)
Or try the Live Demo directly.
Local Install
```shell
git clone https://github.com/tsunamayo7/helix-ai-studio.git
cd helix-ai-studio
uv sync
uv run python run.py
```
Open http://localhost:8504.
Docker Compose (Full Stack)
```shell
git clone https://github.com/tsunamayo7/helix-ai-studio.git
cd helix-ai-studio
docker compose up -d
```
100% Self-Hosted
Every feature can run entirely on your hardware. Ollama for inference, Qdrant for vectors, SQLite for history. You can add cloud APIs when you want, but the baseline is fully local. No vendor lock-in.
Try It Out
Live Demo — no setup needed.
GitHub — star the repo if you find it useful.
If you’re building something similar or have questions about the architecture, drop a comment below. And if you find Helix useful, a star on GitHub really helps with visibility.
Thanks for reading!




