Lakshmi Sravya Vedantham

I Built a Tool So Claude Code Can Use My Colab GPU

The Problem

I use Claude Code daily. It can read my files, run bash commands, edit code — but the moment I need GPU compute, I'm back to copy-pasting between my terminal and Google Colab like it's 2020.

I was working on Videogen, a video generation project. The image generation runs locally on my Mac, but the video model needs a GPU. Every time I wanted to test an image-to-video model, I had to:

  1. Open Colab
  2. Paste my code
  3. Run cells manually
  4. Copy results back
  5. Repeat 50 times

So I built claude-colab — a bridge that gives Claude Code direct GPU access through Google Colab.

How It Works

Colab (T4/A100 GPU)                     Your Machine
┌────────────────────┐                 ┌─────────────────┐
│ Flask API          │◄── HTTPS ──────►│ CLI / MCP Server│
│ E2E Encrypted      │  (cloudflared)  │                 │
│ Bearer token auth  │                 │ Claude Code     │
└────────────────────┘                 │ gains GPU tools │
                                       └─────────────────┘

Three layers:

  1. Colab notebook — Flask API + Cloudflare tunnel on the GPU
  2. CLI — pip install claude-colab, human-friendly commands
  3. MCP server — Claude Code sees GPU tools natively

3-Line Setup

pip install claude-colab
# Open the notebook in Colab, run all cells, copy the connection string
claude-colab connect cc://TOKEN:KEY@your-tunnel.trycloudflare.com

That's it. Claude now has a GPU.
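The cc:// connection string is just a URI, so parsing it is cheap. A sketch of what the connect step plausibly does (the real field names in config.py may differ):

```python
from urllib.parse import urlsplit

def parse_connection_string(uri: str) -> dict:
    """Split cc://TOKEN:KEY@host into auth, encryption, and endpoint parts."""
    u = urlsplit(uri)
    if u.scheme != "cc" or not (u.username and u.password and u.hostname):
        raise ValueError("expected cc://TOKEN:KEY@host")
    return {
        "token": u.username,                    # bearer token for auth
        "key": u.password,                      # Fernet key for E2E encryption
        "base_url": f"https://{u.hostname}",    # the Cloudflare tunnel
    }
```

One string carries everything the client needs, which is why keeping it out of shell history matters (more on that below).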

What Claude Can Do

claude-colab status                    # GPU info
claude-colab exec "nvidia-smi"         # Run any command
claude-colab python -c "import torch; print(torch.cuda.get_device_name(0))"
claude-colab upload model.py /content/model.py
claude-colab download /content/results.csv ./results.csv

Or with the MCP server, Claude calls these as tools directly — no copy-paste, no browser tabs.
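Both paths go through the same HTTP client. A stdlib sketch of the kind of request it builds ("/exec" is an assumed endpoint name, not necessarily the real route):

```python
from urllib.request import Request

def build_exec_request(base_url: str, token: str, payload: bytes) -> Request:
    """Build an authenticated POST to the Colab-side API."""
    return Request(
        f"{base_url}/exec",
        data=payload,  # already Fernet-encrypted by the caller
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
```

Sharing one client between the CLI and the MCP server means auth, encryption, and error handling only have to be right in one place.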

The Security Part

I didn't want my code traveling through Cloudflare in plaintext. Every request and response body is encrypted with Fernet (AES-128-CBC + HMAC-SHA256) before it hits the tunnel.
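This is the cryptography library's Fernet recipe. A minimal sketch of the shared-key roundtrip (key handling here is illustrative; in claude-colab the key comes from the cc:// string):

```python
from cryptography.fernet import Fernet

# Both sides hold the same symmetric key, so the tunnel only relays
# opaque ciphertext in each direction.
key = Fernet.generate_key()
channel = Fernet(key)

ciphertext = channel.encrypt(b'print("hello from the GPU")')
plaintext = channel.decrypt(ciphertext)
assert plaintext == b'print("hello from the GPU")'
```

Fernet also authenticates every message (the HMAC part), so a tampered body fails to decrypt instead of silently executing garbage.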

Who             | Can see              | Cannot see
----------------|----------------------|-------------------------
Cloudflare      | URL paths, timing    | Your code, data, results
Google          | Everything on the VM | Your local files
Random internet | Nothing              | Everything

The connection string carries both a bearer token (auth) and an encryption key (privacy). Three ways to connect so it never lands in your shell history:

claude-colab connect cc://...          # Direct
claude-colab connect                    # Interactive prompt
pbpaste | claude-colab connect -        # Pipe from clipboard

Real Usage: Testing Video Models

The reason I built this — I needed to test image-to-video models for StoryGen. With claude-colab connected to an A100, I asked Claude to:

  1. Upload a FLUX-generated scene image
  2. Install diffusers + transformers
  3. Load Wan2.1-I2V-14B (14 billion parameters)
  4. Generate an animated clip
  5. Download the result

All from one conversation. No browser. No manual steps.

> claude-colab exec "nvidia-smi --query-gpu=name,memory.total --format=csv"
NVIDIA A100-SXM4-80GB, 81920 MiB

Wan2.1-14B loaded in full fp16, generated 33 frames in 3 minutes. That's the kind of workflow that was impossible before.

The Code

82 tests, 96% coverage, 6 source files:

src/claude_colab/
├── crypto.py       # Fernet E2E encryption
├── config.py       # URI parsing, secure config
├── client.py       # HTTP client (shared by CLI + MCP)
├── cli.py          # Click commands
└── mcp_server.py   # 5 MCP tools for Claude Code

The entire local package has 4 dependencies: click, httpx, cryptography, mcp. No torch, no ML libs — those live on Colab.

Try It

pip install claude-colab

Repo: github.com/LakshmiSravyaVedantham/claude-colab

Open the notebook in Colab, run all cells, paste the connection string. Your AI coding agent now has a GPU.

Top comments (2)

Apex Stack

The Cloudflare tunnel + Fernet encryption layer is a really smart security decision — most "connect X to Y" tools skip the encryption part entirely and just pray the tunnel is enough. The three-layer architecture (notebook API, CLI, MCP server) is clean too, especially keeping ML dependencies on the Colab side and the local package at just 4 deps.

I run a fleet of scheduled AI agents that do everything from auditing a 100K-page site to cross-posting content across platforms, and the one wall I keep hitting is exactly this: the moment a task needs heavy compute (like batch processing financial data through a local LLM), the agent has to bail out and I'm back to manual workflows. The MCP server approach where Claude sees GPU tools natively is the right abstraction — it means you can compose GPU tasks into larger agentic workflows without the agent needing to know it's remote compute.

Curious about session persistence — if the Colab runtime disconnects mid-pipeline (the classic T4 timeout), does the CLI or MCP server detect the broken tunnel gracefully, or does Claude just see a hanging request? For long-running generation tasks like your Wan2.1 example, that seems like the main failure mode to guard against.

Lakshmi Sravya Vedantham

Thanks! The Cloudflare + Fernet combo was a non-negotiable for me — didn't want to build another "just expose port 5000" solution.

On disconnections: right now it doesn't handle them gracefully. If the Colab runtime drops mid-run, the MCP call just times out and the agent gets a generic error. It's the biggest gap in the current implementation.

The right fix is probably a job queue with status polling (submit job → poll for result) rather than blocking on a single connection. I haven't built that yet — would be a solid v2 addition. Good callout.
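For anyone curious, the submit-then-poll idea can be sketched in a few lines of stdlib Python (an in-memory toy, not the planned implementation; on Colab this would live behind two API routes):

```python
import threading
import uuid

_jobs: dict[str, dict] = {}

def submit(fn, *args) -> str:
    """Start fn in the background and return a job id immediately."""
    job_id = uuid.uuid4().hex
    _jobs[job_id] = {"status": "running", "result": None}

    def run():
        try:
            _jobs[job_id] = {"status": "done", "result": fn(*args)}
        except Exception as e:
            _jobs[job_id] = {"status": "error", "result": repr(e)}

    threading.Thread(target=run, daemon=True).start()
    return job_id

def poll(job_id: str) -> dict:
    """Cheap status check; safe to retry after a dropped tunnel."""
    return _jobs[job_id]
```

Because polling is idempotent, a broken tunnel mid-generation just means reconnecting and polling again instead of losing the whole run.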