# Qwopus3.5-27B-v3-deployment

Scripts for deploying Qwopus3.5-27B-v3 with vLLM and an Anthropic API proxy, enabling Claude Code to use the model as a drop-in replacement.

## Create Conda env

```bash
conda create -p <path-to-conda-env> python=3.11
conda activate <path-to-conda-env>
```

## Conda Environment Dependencies

Install the requirements into the activated environment.
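The README does not spell out the install command; a minimal sketch, assuming a `requirements.txt` at the repository root:

```shell
# Install the Python dependencies into the active conda env
# (assumes a requirements.txt at the repository root).
pip install -r requirements.txt
```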

## Installation

Download the weight files to `<path-to-weight>`; the directory should look like:

- <path-to-weight>
  - .gitattributes
  - README.md
  - chat_template.jinja
  - config.json
  - model-00001-of-00012.safetensors
  - ...
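One way to fetch the weights, assuming they are published on the Hugging Face Hub under the id `Jackrong/Qwopus3.5-27B-v3` (an assumption; adjust to wherever the weights are actually hosted):

```shell
# Hypothetical download sketch using the Hugging Face CLI;
# substitute your own target directory for <path-to-weight>.
huggingface-cli download Jackrong/Qwopus3.5-27B-v3 --local-dir <path-to-weight>
```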

## Start vLLM and proxy

In one terminal, start the vLLM server:

```bash
conda activate <path-to-conda-env>
bash start_vllm.sh
```

In a second terminal, start the Anthropic proxy:

```bash
conda activate <path-to-conda-env>
bash start_proxy.sh
```
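Before pointing Claude Code at the proxy, you can check that both services are listening on their default ports (8767 for vLLM, 8801 for the proxy). A small sketch:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Default ports from start_vllm.sh (8767) and start_proxy.sh (8801).
for name, port in [("vLLM", 8767), ("proxy", 8801)]:
    print(f"{name} on :{port} is", "up" if port_open("localhost", port) else "down")
```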

Edit `~/.claude/settings.json` (fill in the IP of the machine running the proxy).

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://<your-ip>:8801",
    "ANTHROPIC_AUTH_TOKEN": "sk-placeholder",
    "ANTHROPIC_MODEL": "Qwopus3.5-27B-v3"
  }
}
```
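If you prefer to script the edit, a sketch that merges the env block into the existing settings without clobbering other keys (the write is commented out so you can inspect the result first; `<your-ip>` stays a placeholder):

```python
import json
import pathlib

# Env block Claude Code needs in order to route requests through the proxy.
# <your-ip> is a placeholder: use the address of the proxy host.
env = {
    "ANTHROPIC_BASE_URL": "http://<your-ip>:8801",
    "ANTHROPIC_AUTH_TOKEN": "sk-placeholder",
    "ANTHROPIC_MODEL": "Qwopus3.5-27B-v3",
}

settings_path = pathlib.Path.home() / ".claude" / "settings.json"
# Merge into any existing settings instead of overwriting the whole file.
settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
settings.setdefault("env", {}).update(env)
# settings_path.write_text(json.dumps(settings, indent=2))  # uncomment to apply
print(json.dumps(settings, indent=2))
```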

## Configuration

### start_vllm.sh

| Variable | Default | Description |
|---|---|---|
| `CONDA_ENV` | | Path to conda environment |
| `MODEL` | | Path to model weight directory |
| `CUDA_VISIBLE_DEVICES` | `1,2` | GPU device IDs to use |
| `PORT` | `8767` | vLLM server port |
| `TP` | `2` | Tensor parallel size |
| `MAX_LEN` | `200000` | Max model context length |
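These variables can be overridden per invocation, assuming the script uses the usual `${VAR:-default}` fallback pattern (the exact script contents are an assumption here):

```shell
# Fallbacks mirror the defaults in the table above; inline
# assignments before the script name override them, e.g.:
#   CUDA_VISIBLE_DEVICES=0,1 PORT=8000 bash start_vllm.sh
PORT="${PORT:-8767}"
TP="${TP:-2}"
MAX_LEN="${MAX_LEN:-200000}"
echo "port=$PORT tp=$TP max_len=$MAX_LEN"
```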

### start_proxy.sh

| Variable | Default | Description |
|---|---|---|
| `VLLM_URL` | `http://localhost:8767` | vLLM backend URL |
| `PROXY_PORT` | `8801` | Anthropic proxy listen port |
| `MODEL_NAME` | `Qwopus3.5-27B-v3` | Served model name |

## About

Guide for deploying Jackrong/Qwopus3.5-27B-v3.
