Lakshmi Sravya Vedantham

I Built a Tool So Claude Code Can Use My Colab GPU

The Problem

I use Claude Code daily. It can read my files, run bash commands, edit code — but the moment I need GPU compute, I'm back to copy-pasting between my terminal and Google Colab like it's 2020.

I was working on Videogen, a video generation project. The image generation runs locally on my Mac, but the video model needs a GPU. Every time I wanted to test an image-to-video model, I had to:

  1. Open Colab
  2. Paste my code
  3. Run cells manually
  4. Copy results back
  5. Repeat 50 times

So I built claude-colab — a bridge that gives Claude Code direct GPU access through Google Colab.

How It Works

Colab (T4/A100 GPU)                     Your Machine
┌────────────────────┐                 ┌─────────────────┐
│ Flask API          │◄── HTTPS ──────►│ CLI / MCP Server│
│ E2E Encrypted      │  (cloudflared)  │                 │
│ Bearer token auth  │                 │ Claude Code     │
└────────────────────┘                 │ gains GPU tools │
                                       └─────────────────┘

Three layers:

  1. Colab notebook — Flask API + Cloudflare tunnel on the GPU
  2. CLI — pip install claude-colab, human-friendly commands
  3. MCP server — Claude Code sees GPU tools natively

3-Line Setup

pip install claude-colab
# Open the notebook in Colab, run all cells, copy the connection string
claude-colab connect cc://TOKEN:KEY@your-tunnel.trycloudflare.com

That's it. Claude now has a GPU.
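The cc:// connection string is just a URI, so parsing it is cheap. A sketch of what the connect step plausibly does (the real field names in config.py may differ):

```python
from urllib.parse import urlsplit

def parse_connection_string(uri: str) -> dict:
    """Split cc://TOKEN:KEY@host into auth, encryption, and endpoint parts."""
    u = urlsplit(uri)
    if u.scheme != "cc" or not (u.username and u.password and u.hostname):
        raise ValueError("expected cc://TOKEN:KEY@host")
    return {
        "token": u.username,                    # bearer token for auth
        "key": u.password,                      # Fernet key for E2E encryption
        "base_url": f"https://{u.hostname}",    # the Cloudflare tunnel
    }
```

One string carries everything the client needs, which is why keeping it out of shell history matters (more on that below).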

What Claude Can Do

claude-colab status                    # GPU info
claude-colab exec "nvidia-smi"         # Run any command
claude-colab python -c "import torch; print(torch.cuda.get_device_name(0))"
claude-colab upload model.py /content/model.py
claude-colab download /content/results.csv ./results.csv

Or with the MCP server, Claude calls these as tools directly — no copy-paste, no browser tabs.
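Both paths go through the same HTTP client. A stdlib sketch of the kind of request it builds ("/exec" is an assumed endpoint name, not necessarily the real route):

```python
from urllib.request import Request

def build_exec_request(base_url: str, token: str, payload: bytes) -> Request:
    """Build an authenticated POST to the Colab-side API."""
    return Request(
        f"{base_url}/exec",
        data=payload,  # already Fernet-encrypted by the caller
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
```

Sharing one client between the CLI and the MCP server means auth, encryption, and error handling only have to be right in one place.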

The Security Part

I didn't want my code traveling through Cloudflare in plaintext. Every request and response body is encrypted with Fernet (AES-128-CBC + HMAC-SHA256) before it hits the tunnel.
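This is the cryptography library's Fernet recipe. A minimal sketch of the shared-key roundtrip (key handling here is illustrative; in claude-colab the key comes from the cc:// string):

```python
from cryptography.fernet import Fernet

# Both sides hold the same symmetric key, so the tunnel only relays
# opaque ciphertext in each direction.
key = Fernet.generate_key()
channel = Fernet(key)

ciphertext = channel.encrypt(b'print("hello from the GPU")')
plaintext = channel.decrypt(ciphertext)
assert plaintext == b'print("hello from the GPU")'
```

Fernet also authenticates every message (the HMAC part), so a tampered body fails to decrypt instead of silently executing garbage.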

Who             | Can see              | Cannot see
----------------|----------------------|-------------------------
Cloudflare      | URL paths, timing    | Your code, data, results
Google          | Everything on the VM | Your local files
Random internet | Nothing              | Everything

The connection string carries both a bearer token (auth) and an encryption key (privacy). Three ways to connect so it never lands in your shell history:

claude-colab connect cc://...          # Direct
claude-colab connect                    # Interactive prompt
pbpaste | claude-colab connect -        # Pipe from clipboard

Real Usage: Testing Video Models

The reason I built this — I needed to test image-to-video models for StoryGen. With claude-colab connected to an A100, I asked Claude to:

  1. Upload a FLUX-generated scene image
  2. Install diffusers + transformers
  3. Load Wan2.1-I2V-14B (14 billion parameters)
  4. Generate an animated clip
  5. Download the result

All from one conversation. No browser. No manual steps.

> claude-colab exec "nvidia-smi --query-gpu=name,memory.total --format=csv"
NVIDIA A100-SXM4-80GB, 81920 MiB

Wan2.1-14B loaded in full fp16, generated 33 frames in 3 minutes. That's the kind of workflow that was impossible before.

The Code

82 tests, 96% coverage, 6 source files:

src/claude_colab/
├── crypto.py       # Fernet E2E encryption
├── config.py       # URI parsing, secure config
├── client.py       # HTTP client (shared by CLI + MCP)
├── cli.py          # Click commands
└── mcp_server.py   # 5 MCP tools for Claude Code

The entire local package has 4 dependencies: click, httpx, cryptography, mcp. No torch, no ML libs — those live on Colab.

Try It

pip install claude-colab

Repo: github.com/LakshmiSravyaVedantham/claude-colab

Open the notebook in Colab, run all cells, paste the connection string. Your AI coding agent now has a GPU.

Top comments (2)

Apex Stack

The Cloudflare tunnel + Fernet encryption layer is a really smart security decision — most "connect X to Y" tools skip the encryption part entirely and just pray the tunnel is enough. The three-layer architecture (notebook API, CLI, MCP server) is clean too, especially keeping ML dependencies on the Colab side and the local package at just 4 deps.

I run a fleet of scheduled AI agents that do everything from auditing a 100K-page site to cross-posting content across platforms, and the one wall I keep hitting is exactly this: the moment a task needs heavy compute (like batch processing financial data through a local LLM), the agent has to bail out and I'm back to manual workflows. The MCP server approach where Claude sees GPU tools natively is the right abstraction — it means you can compose GPU tasks into larger agentic workflows without the agent needing to know it's remote compute.

Curious about session persistence — if the Colab runtime disconnects mid-pipeline (the classic T4 timeout), does the CLI or MCP server detect the broken tunnel gracefully, or does Claude just see a hanging request? For long-running generation tasks like your Wan2.1 example, that seems like the main failure mode to guard against.

Lakshmi Sravya Vedantham

Thanks! The Cloudflare + Fernet combo was a non-negotiable for me — didn't want to build another "just expose port 5000" solution.

On disconnections: right now it doesn't handle them gracefully. If the Colab runtime drops mid-run, the MCP call just times out and the agent gets a generic error. It's the biggest gap in the current implementation.

The right fix is probably a job queue with status polling (submit job → poll for result) rather than blocking on a single connection. I haven't built that yet — would be a solid v2 addition. Good callout.
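For anyone curious, the submit-then-poll idea can be sketched in a few lines of stdlib Python (an in-memory toy, not the planned implementation; on Colab this would live behind two API routes):

```python
import threading
import uuid

_jobs: dict[str, dict] = {}

def submit(fn, *args) -> str:
    """Start fn in the background and return a job id immediately."""
    job_id = uuid.uuid4().hex
    _jobs[job_id] = {"status": "running", "result": None}

    def run():
        try:
            _jobs[job_id] = {"status": "done", "result": fn(*args)}
        except Exception as e:
            _jobs[job_id] = {"status": "error", "result": repr(e)}

    threading.Thread(target=run, daemon=True).start()
    return job_id

def poll(job_id: str) -> dict:
    """Cheap status check; safe to retry after a dropped tunnel."""
    return _jobs[job_id]
```

Because polling is idempotent, a broken tunnel mid-generation just means reconnecting and polling again instead of losing the whole run.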