Use the Claude Code CLI with local Qwen models on your Apple Silicon Mac (M1/M2/M3).
No cloud, no API fees, your code never leaves your machine.
All inference happens locally; localhost-only binding, backed by 93 security tests, keeps every request on your machine.
No subscription fees, no pay-per-token. Use your Mac's hardware for unlimited AI assistance.
Hardware-aware configuration for M1/M2/M3 Macs with Metal acceleration and thermal management.
Drop-in backend for Claude Code. Keep your existing workflows; just point the CLI at localhost.
Choose between Ollama and llama.cpp, and switch backends without changing your setup.
Real-time dashboard tracks performance, thermal behavior, and token throughput.
# Clone the repository
git clone https://github.com/kmesiab/qwenvert
cd qwenvert
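# Optional: work inside a virtual environment first (standard Python
# practice; not a qwenvert requirement)
python3 -m venv .venv && source .venv/bin/activate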
# Install with pip
pip install -e .
# Detect hardware and download the optimal model
qwenvert init
# Start the adapter + backend
qwenvert start
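# Optional sanity check if you chose the Ollama backend: list its local
# models (this assumes Ollama's default port 11434; skip for llama.cpp)
curl http://localhost:11434/api/tags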
export ANTHROPIC_BASE_URL=http://localhost:8088
export ANTHROPIC_API_KEY=local-qwen
export ANTHROPIC_MODEL=qwenvert-default
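# Optional: persist these across sessions (macOS defaults to zsh)
echo 'export ANTHROPIC_BASE_URL=http://localhost:8088
export ANTHROPIC_API_KEY=local-qwen
export ANTHROPIC_MODEL=qwenvert-default' >> ~/.zshrc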
# Start coding!
claude
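To verify the adapter end to end before launching Claude Code, you can send it a raw request. This sketch assumes qwenvert mirrors Anthropic's Messages endpoint (POST /v1/messages), which is what Claude Code itself calls:

# Smoke test: send an Anthropic-style request straight to the adapter
curl http://localhost:8088/v1/messages \
  -H "x-api-key: local-qwen" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "qwenvert-default",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Say hello in one line."}]
  }'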
The first run downloads a 4-10 GB model (one time only). Requires Python 3.9-3.12 and an Apple Silicon Mac (M1/M2/M3).
Qwenvert is an HTTP adapter that sits between Claude Code and a local LLM backend, translating Anthropic-style API requests into the backend's native format and translating responses back.
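Concretely, when the adapter receives an Anthropic-style request like the smoke test above, it forwards an equivalent request in the backend's native format. A rough sketch for the Ollama backend follows; the model name and exact field mapping here are illustrative, not qwenvert's actual internals:

# Approximate equivalent the adapter issues to the Ollama backend
curl http://localhost:11434/api/chat \
  -d '{
    "model": "qwen2.5-coder",
    "stream": false,
    "messages": [{"role": "user", "content": "Say hello in one line."}]
  }'

Responses take the reverse path: the backend's chat message is reshaped into Anthropic's content-block format before being returned to Claude Code.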