Privacy-First Local AI
for Claude Code

Use Claude Code CLI with local Qwen models on your Mac M1/M2/M3.
No cloud, no API fees, your code never leaves your machine.

100%
Private
$0
API Fees
93
Security Tests

Why Qwenvert?

🔒

Privacy First

All inference happens locally. Localhost-only binding with 93 security tests ensures your code never leaves your machine.

💰

Zero API Costs

No subscription fees, no pay-per-token. Use your Mac's hardware for unlimited AI assistance.

Apple Silicon Optimized

Hardware-aware configuration for M1/M2/M3 Macs with Metal acceleration and thermal management.

🔄

Full API Compatibility

Drop-in replacement for Claude Code. Works with existing workflows, just point to localhost.

🎯

Multiple Backends

Choose between Ollama or llama.cpp. Switch backends without changing your setup.

📊

Built-in Monitoring

Real-time dashboard tracks performance, thermal behavior, and token throughput.

Quick Start

Note: Qwenvert is not yet published on PyPI. Install from source as shown below.
1

Installation

# Clone the repository
git clone https://github.com/kmesiab/qwenvert
cd qwenvert

# Install with pip
pip install -e .
2

Initialize & Start

# Detect hardware and download optimal model
qwenvert init

# Start the adapter + backend
qwenvert start
3

Configure Claude Code

export ANTHROPIC_BASE_URL=http://localhost:8088
export ANTHROPIC_API_KEY=local-qwen
export ANTHROPIC_MODEL=qwenvert-default

# Start coding!
claude

First run downloads a 4-10GB model (one-time). Requires Python 3.9-3.12 and Mac M1/M2/M3.

How It Works

Qwenvert is an HTTP adapter that translates between Claude Code and local LLM backends:

Claude Code CLI Your IDE
Qwenvert Adapter localhost:8088
Backend Ollama or llama.cpp
Qwen Model Local inference

What Qwenvert Does

  • Translates APIs: Converts Anthropic Messages API → Ollama/llama.cpp format
  • Validates Security: All URLs/hosts checked for localhost-only access (93 tests)
  • Manages Backends: Launches and monitors Ollama or llama.cpp servers
  • Streams Responses: Server-Sent Events for real-time token streaming
  • Optimizes Hardware: Auto-configures for your Mac's specs and thermal profile

Documentation