Tsunamayo
Helix AI Studio v2.0: 7 AI Providers, Pipeline, and CrewAI in One Self-Hosted App

TL;DR

I rebuilt my self-hosted AI chat app from the ground up. Helix AI Studio v2.0 now connects 7 AI providers, runs a 3-step automated pipeline (Plan → Execute → Final Answer), and supports CrewAI multi-agent teams — all in a single lightweight web UI you can run entirely on your own hardware.

Live Demo | GitHub | MIT License


Why I Built This

I was tired of switching between ChatGPT, Claude, Ollama’s terminal, and various other AI tools throughout my day. I wanted one UI that could talk to all of them.

The first version was a good start, but as I kept using it daily, I realized the app needed to go beyond just “chat with multiple providers.” I needed:

  • Automated workflows — not just Q&A, but multi-step task execution
  • Team-based AI — multiple agents collaborating on complex problems
  • CLI integration — using Claude Code, Codex, and Gemini CLI directly from the web UI

So I rebuilt it. Here’s what v2.0 looks like.


What’s New in v2.0

1. 3-Step Pipeline: Plan → Execute → Final Answer

Instead of just sending a prompt and getting a response, v2.0 can run an automated pipeline:

Step 1: Plan — A cloud/CLI model analyzes your task and generates a plan
Step 2: Execute — A local model (or CrewAI team) executes the plan
Step 3: Final Answer — A cloud/CLI model verifies results and delivers the answer

Pipeline Demo

Different models are good at different things. A powerful cloud model like Claude can create an excellent plan, a fast local model can do the heavy lifting, and then Claude can verify the output. You get cloud-quality reasoning with local-model execution.
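The orchestration itself is easy to sketch. Here's a minimal, hypothetical version in plain Python — the `call`-style provider signature and the stub providers are illustrative assumptions, not Helix's actual internals:

```python
# Minimal sketch of a Plan -> Execute -> Final Answer pipeline.
# Each provider is just a callable: prompt in, text out. In the real
# app these would wrap a cloud API, a local Ollama model, or a CLI tool.

def run_pipeline(task, planner, executor, verifier):
    """Run the 3-step pipeline and return the final answer."""
    # Step 1: a strong cloud/CLI model writes the plan.
    plan = planner(f"Create a step-by-step plan for this task:\n{task}")

    # Step 2: a fast local model (or a CrewAI team) executes the plan.
    result = executor(f"Follow this plan and produce the result:\n{plan}")

    # Step 3: the strong model verifies and delivers the final answer.
    return verifier(
        f"Task: {task}\nPlan: {plan}\nResult: {result}\n"
        "Verify the result and write the final answer."
    )

# Stub providers, just to show the flow:
planner = lambda p: "1. parse input  2. compute  3. format"
executor = lambda p: "computed output"
verifier = lambda p: "verified: computed output"

answer = run_pipeline("summarize a log file", planner, executor, verifier)
```

Swapping a provider means swapping a callable — which is exactly why mixing cloud, local, and CLI models in one pipeline stays cheap.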

2. CrewAI Multi-Agent Teams

v2.0 integrates CrewAI for multi-agent collaboration, running entirely on local models via Ollama. Three preset teams are ready to go:

  • dev_team — for coding tasks (architect, developer, reviewer)
  • research_team — for research and analysis
  • writing_team — for content creation

Each agent can use a different model, and the system estimates VRAM usage so you know if your GPU can handle it. This is all Ollama-only — no cloud API costs.
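A VRAM check like this can be approximated with nothing more than a per-model size table. The sketch below is hypothetical — the model sizes and the 1.2× overhead factor are illustrative assumptions, not Helix's actual numbers:

```python
# Rough VRAM estimate for a multi-agent team running on Ollama.
# Assumption: a quantized model needs roughly its file size in VRAM,
# plus ~20% overhead for KV cache and activations.

MODEL_VRAM_GB = {          # illustrative sizes for common quantized models
    "llama3.1:8b": 4.9,
    "qwen2.5:14b": 9.0,
    "mistral:7b": 4.1,
}

def estimate_team_vram(agent_models, overhead=1.2):
    """Sum VRAM for the distinct models a team loads.

    A model shared by several agents is only loaded once, so we
    deduplicate before summing."""
    distinct = set(agent_models)
    return round(sum(MODEL_VRAM_GB[m] for m in distinct) * overhead, 1)

# architect, developer, reviewer — two agents share the same model:
dev_team = ["llama3.1:8b", "qwen2.5:14b", "llama3.1:8b"]
needed = estimate_team_vram(dev_team)
fits = needed <= 24.0  # e.g. a 24 GB GPU
```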

3. CLI Agent Integration

v2.0 can use Claude Code CLI, Codex CLI, and Gemini CLI as providers, directly from the web UI.

Provider Switch

The CLI tools are auto-detected. If you have them installed, they appear in the provider dropdown. If not, they’re hidden.
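Auto-detection like this usually boils down to a PATH lookup. A minimal sketch using only the standard library — the binary names match the real CLIs' commands, but the dict shape is my assumption:

```python
import shutil

# Candidate CLI providers and the binary each one ships.
CLI_PROVIDERS = {
    "Claude Code CLI": "claude",
    "Codex CLI": "codex",
    "Gemini CLI": "gemini",
}

def detect_cli_providers():
    """Return only the providers whose binary is found on PATH."""
    return {
        name: path
        for name, binary in CLI_PROVIDERS.items()
        if (path := shutil.which(binary)) is not None
    }

available = detect_cli_providers()  # only installed CLIs show up here
```

The UI then populates the provider dropdown from `available`, so missing tools never appear as broken options.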


The Full Feature Set

7 AI Providers in One UI

| Provider | Method | Streaming |
| --- | --- | --- |
| Ollama | HTTP API (localhost) | Yes |
| Claude API | Anthropic SDK | Yes |
| OpenAI API | OpenAI SDK | Yes |
| vLLM / llama.cpp / LM Studio | OpenAI-compatible API | Yes |
| Claude Code CLI | `claude -p` | Pseudo |
| Codex CLI | `codex exec` | Pseudo |
| Gemini CLI | `gemini -p` | Pseudo |
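"Pseudo" streaming means the CLI runs to completion and the UI then emits the finished output in small chunks, so it still renders progressively. A hypothetical sketch — the chunk size and generator shape are assumptions:

```python
def pseudo_stream(text, chunk_size=12):
    """Yield a completed CLI response in small chunks so the UI can
    render it progressively, mimicking token-by-token streaming."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

# In the real app `full_output` would come from running e.g.
# `claude -p "<prompt>"` via subprocess; here we fake it:
full_output = "The answer, produced all at once by the CLI."
chunks = list(pseudo_stream(full_output))
reassembled = "".join(chunks)
```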

Streaming Demo

RAG Knowledge Base

  • Docling Parser for PDF, Office docs, and images
  • Hybrid search — dense vector + BM25 sparse + RRF fusion
  • TEI Reranker (bge-reranker-v2-m3) for precision re-scoring
  • Ollama embedding — runs locally, zero API cost
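Reciprocal Rank Fusion (RRF) merges the dense and BM25 result lists by summing `1 / (k + rank)` per document across lists. A minimal sketch — `k=60` is the constant from the original RRF paper, and the doc IDs are made up:

```python
def rrf_fuse(rankings, k=60):
    """Fuse multiple ranked lists of doc IDs via Reciprocal Rank Fusion.

    Each list contributes 1/(k + rank) to a doc's score (rank is
    1-based); higher fused score ranks first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc_a", "doc_b", "doc_c"]   # vector-search order
sparse = ["doc_b", "doc_d", "doc_a"]  # BM25 order
fused = rrf_fuse([dense, sparse])
# doc_b wins: it ranks high in both lists, which is the point of RRF.
```

The fused list is then handed to the reranker for final precision scoring.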

Mem0 Shared Memory

Persistent, cross-session memory backed by Qdrant. The memory is shared across tools — Claude Code CLI, Codex CLI, and Open WebUI all read from the same Qdrant collection.

Web Search

Trigger a search manually with the search button, or let the LLM decide on its own when it needs current information.

Search Demo


Tech Stack

  • Backend: FastAPI + Python 3.12
  • Frontend: Jinja2 templates + Tailwind CSS + Alpine.js (no React, no build step)
  • Database: SQLite (chat history) + Qdrant (vectors)
  • Streaming: WebSocket
  • Deployment: Docker Compose or bare metal

Getting Started

One-Click Deploy (Free)

Deploy to Render

Or try the Live Demo directly.

Local Install

```bash
git clone https://github.com/tsunamayo7/helix-ai-studio.git
cd helix-ai-studio
uv sync
uv run python run.py
```

Open http://localhost:8504.

Docker Compose (Full Stack)

```bash
git clone https://github.com/tsunamayo7/helix-ai-studio.git
cd helix-ai-studio
docker compose up -d
```

100% Self-Hosted

Every feature can run entirely on your hardware. Ollama for inference, Qdrant for vectors, SQLite for history. You can add cloud APIs when you want, but the baseline is fully local. No vendor lock-in.


Try It Out

Live Demo — no setup needed.

GitHub — star the repo if you find it useful.

App Tour

If you’re building something similar or have questions about the architecture, drop a comment below. And if you find Helix useful, a star on GitHub really helps with visibility.

Thanks for reading!
