Merged
22 changes: 19 additions & 3 deletions .env.example
@@ -3,12 +3,20 @@
# cp .env.example .env

# ========================
# REQUIRED
# REQUIRED: provider credential
# ========================
# Phantom defaults to Anthropic. Set ANTHROPIC_API_KEY for the default setup.
# To use a different provider (Z.AI, OpenRouter, Ollama, vLLM, LiteLLM, custom),
# configure the `provider:` block in phantom.yaml and set the matching env var
# below. See docs/providers.md for the full reference.

# Your Anthropic API key (starts with sk-ant-)
ANTHROPIC_API_KEY=

# Alternative provider keys (set one that matches your provider block in phantom.yaml):
# ZAI_API_KEY=
# OPENROUTER_API_KEY=
# LITELLM_KEY=

# ========================
# OPTIONAL: Slack
# ========================
@@ -36,12 +44,20 @@ ANTHROPIC_API_KEY=
# Agent role (default: swe). Options: swe, base
# PHANTOM_ROLE=swe

# Claude model for the agent brain.
# Model for the agent brain. Keep a Claude model ID here even when using a
# non-Anthropic provider: the bundled cli.js has hardcoded capability checks
# against Claude model names. Use `provider.model_mappings` in phantom.yaml
# to redirect the wire call to your actual model (e.g., glm-5.1).
# Options:
#   claude-sonnet-4-6 - Fast, capable, lower cost (default, recommended)
#   claude-opus-4-6   - Most capable, higher cost
# PHANTOM_MODEL=claude-sonnet-4-6

# Provider override via env var (alternative to editing phantom.yaml).
# Options: anthropic (default), zai, openrouter, vllm, ollama, litellm, custom
# PHANTOM_PROVIDER_TYPE=anthropic
# PHANTOM_PROVIDER_BASE_URL=

# Domain for public URL (e.g., ghostwright.dev)
# When set with PHANTOM_NAME, derives public URL as https://<name>.<domain>
# PHANTOM_DOMAIN=
6 changes: 3 additions & 3 deletions CLAUDE.md
@@ -1,13 +1,13 @@
# Phantom

Phantom is an autonomous AI co-worker that runs as a persistent Bun process on a VM. It wraps the Claude Agent SDK (Opus 4.6), maintains vector-backed memory across sessions, rewrites its own configuration through a validated self-evolution engine, communicates via Slack/Telegram/Email/Webhook, and exposes all capabilities as an MCP server. 27,000+ lines of TypeScript, 822 tests, v0.18.2. Apache 2.0, repo at ghostwright/phantom.
Phantom is an autonomous AI co-worker that runs as a persistent Bun process on a VM. It wraps the Claude Agent SDK as a subprocess (Anthropic by default, swappable via a `provider:` config block to Z.AI/GLM-5.1, OpenRouter, Ollama, vLLM, LiteLLM, or any Anthropic Messages API compatible endpoint). It maintains vector-backed memory across sessions, rewrites its own configuration through a validated self-evolution engine, communicates via Slack/Telegram/Email/Webhook, and exposes all capabilities as an MCP server. 27,000+ lines of TypeScript, 875 tests, v0.18.2. Apache 2.0, repo at ghostwright/phantom.

## Tech Stack

| Layer | Technology |
|-------|-----------|
| Runtime | Bun (TypeScript-native, built-in SQLite, no bundler) |
| Agent | Claude Agent SDK (`@anthropic-ai/claude-agent-sdk`) with Opus 4.6, 1M context |
| Agent | Claude Agent SDK (`@anthropic-ai/claude-agent-sdk`) subprocess. Provider is configurable via `src/config/providers.ts`: Anthropic (default), Z.AI, OpenRouter, Ollama, vLLM, LiteLLM, custom. |
| Memory | Qdrant (vector DB, Docker) + Ollama (nomic-embed-text, local embeddings) |
| State | SQLite via Bun (sessions, tasks, metrics, evolution versions, scheduled jobs) |
| Channels | Slack (Socket Mode, primary), Telegram (long polling), Email (IMAP/SMTP), Webhook (HMAC-SHA256), CLI |
@@ -41,7 +41,7 @@ If you find yourself writing a function that does something the agent can do bet

```bash
bun install # Install dependencies
bun test # Run 770 tests
bun test # Run 875 tests
bun run src/index.ts # Start the server
bun run src/cli/main.ts init --yes # Initialize config (reads env vars)
bun run src/cli/main.ts doctor # Check all subsystems
31 changes: 30 additions & 1 deletion README.md
@@ -7,7 +7,7 @@

<p align="center">
<a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" alt="License"></a>
<img src="https://img.shields.io/badge/tests-822%20passed-brightgreen.svg" alt="Tests">
<img src="https://img.shields.io/badge/tests-875%20passed-brightgreen.svg" alt="Tests">
<a href="https://hub.docker.com/r/ghostwright/phantom"><img src="https://img.shields.io/docker/pulls/ghostwright/phantom.svg" alt="Docker Pulls"></a>
<img src="https://img.shields.io/badge/version-0.18.2-orange.svg" alt="Version">
</p>
@@ -75,6 +75,34 @@ A Phantom discovered [Vigil](https://github.com/baudsmithstudios/vigil), a light

This is what happens when you give an AI its own computer.

## Bring Your Own Model

Phantom is not locked to any single AI backend. It ships with support for seven providers out of the box, configured through a single YAML block:

- **Anthropic** (default) - Claude Opus, Sonnet, Haiku
- **Z.AI** - GLM-5.1 and GLM-4.5-Air via [Z.AI's Anthropic-compatible API](https://docs.z.ai/guides/llm/glm-5). Roughly 15x cheaper than Claude Opus for comparable coding quality.
- **OpenRouter** - 100+ models through one key
- **Ollama** - Any GGUF model on your own GPU, zero API cost
- **vLLM** - Self-hosted inference with OpenAI-compatible endpoints
- **LiteLLM** - Local proxy bridging OpenAI, Gemini, and more
- **Custom** - Any Anthropic Messages API compatible endpoint

Switching providers takes a few lines of YAML:

```yaml
# phantom.yaml
model: claude-sonnet-4-6
provider:
  type: zai
  api_key_env: ZAI_API_KEY
  model_mappings:
    sonnet: glm-5.1
```

Set `ZAI_API_KEY` in `.env`, restart, done. Both the main agent and every evolution judge flow through the chosen provider from that point on. The tools are the same, the memory is the same, the self-evolution pipeline is the same. Only the brain changes.

Anthropic stays the default. Existing deployments continue to work with no configuration changes. See [docs/providers.md](docs/providers.md) for the full reference.

## Quick Start

### Docker (recommended)
@@ -186,6 +214,7 @@ Because the agent that can only use pre-built tools hits a ceiling. Phantom buil
| Feature | Why it matters |
|---------|----------------|
| **Its own computer** | Your laptop stays yours. The agent installs software, runs 24/7, and builds infrastructure on its own machine. |
| **Bring your own model** | Anthropic, Z.AI (GLM-5.1), OpenRouter, Ollama, vLLM, LiteLLM, or any Anthropic Messages API compatible endpoint. Pick your backend in YAML, same agent everywhere. |
| **Self-evolution** | The agent rewrites its own config after every session, validated by LLM judges. Day 30 knows things Day 1 didn't. |
| **Persistent memory** | Three tiers of vector memory. Mention something on Monday, it uses it on Wednesday. No re-explaining. |
| **Dynamic tools** | Creates and registers its own MCP tools at runtime. Tools survive restarts and work across sessions. |
27 changes: 27 additions & 0 deletions config/phantom.yaml
@@ -16,3 +16,30 @@ timeout_minutes: 240
# url: https://data.ghostwright.dev/mcp
# token: "bearer-token-for-data"
# description: "Data Analyst Phantom"

# Provider selection. Defaults to Anthropic. Uncomment and customize to use
# a different backend. Both the main agent and every LLM judge flow through
# the chosen provider; authentication happens at the env var named below.
#
# provider:
#   type: anthropic   # anthropic | zai | openrouter | vllm | ollama | litellm | custom
#   # api_key_env: ANTHROPIC_API_KEY
#
# Example: GLM-5.1 via Z.AI's Anthropic-compatible API (15x cheaper than Opus)
# provider:
#   type: zai
#   api_key_env: ZAI_API_KEY
#   model_mappings:
#     opus: glm-5.1
#     sonnet: glm-5.1
#     haiku: glm-4.5-air
#
# Example: Local vLLM server hosting any OpenAI-compatible model
# provider:
#   type: vllm
#   base_url: http://localhost:8000
#
# Example: Local Ollama (free, runs on your GPU)
# provider:
#   type: ollama
#   base_url: http://localhost:11434
20 changes: 19 additions & 1 deletion docs/getting-started.md
@@ -74,7 +74,25 @@ Open `.env` in your editor and fill in these values:
ANTHROPIC_API_KEY=sk-ant-your-key-here
```

Your Anthropic API key. This is the only value you absolutely must set.
Your Anthropic API key. This is the only value you absolutely must set for the default setup.

**Using a different provider?** Phantom supports Z.AI (GLM-5.1, ~15x cheaper than Claude Opus), OpenRouter, Ollama, vLLM, LiteLLM, and custom endpoints. For example, to run Phantom on Z.AI:

```
ZAI_API_KEY=your-zai-key
```

Then add this to `phantom.yaml`:

```yaml
provider:
  type: zai
  api_key_env: ZAI_API_KEY
  model_mappings:
    sonnet: glm-5.1
```

See [docs/providers.md](providers.md) for the full provider reference.

### Slack (recommended)

188 changes: 188 additions & 0 deletions docs/providers.md
@@ -0,0 +1,188 @@
# Provider Configuration

Phantom routes every LLM query (the main agent and every evolution judge) through the Claude Agent SDK as a subprocess. By setting environment variables that the bundled `cli.js` already honors, you can point that subprocess at any Anthropic Messages API compatible endpoint without changing a line of code.

The `provider:` block in `phantom.yaml` is a small config surface that translates into those environment variables for you.

## Supported Providers

| Type | Base URL | API Key Env | Notes |
|------|----------|-------------|-------|
| `anthropic` (default) | `https://api.anthropic.com` | `ANTHROPIC_API_KEY` | Claude Opus, Sonnet, Haiku |
| `zai` | `https://api.z.ai/api/anthropic` | `ZAI_API_KEY` | GLM-5.1 and GLM-4.5-Air, roughly 15x cheaper than Opus |
| `openrouter` | `https://openrouter.ai/api/v1` | `OPENROUTER_API_KEY` | 100+ models through a single key |
| `vllm` | `http://localhost:8000` | none | Self-hosted OpenAI-compatible inference |
| `ollama` | `http://localhost:11434` | none | Local GGUF models, zero API cost |
| `litellm` | `http://localhost:4000` | `LITELLM_KEY` | Local proxy bridging OpenAI, Gemini, and others |
| `custom` | (you set it) | (you set it) | Any Anthropic Messages API compatible endpoint |

## Quick Reference

### Anthropic (default)

No configuration needed. Existing deployments continue to work unchanged.

```yaml
# phantom.yaml
model: claude-sonnet-4-6
# No provider block = defaults to anthropic
```

```bash
# .env
ANTHROPIC_API_KEY=sk-ant-...
```

### Z.AI / GLM-5.1

Z.AI provides an Anthropic Messages API compatible endpoint at `https://api.z.ai/api/anthropic`. Phantom ships with a `zai` preset that points there automatically. Get a key at [docs.z.ai](https://docs.z.ai/guides/llm/glm-5).

```yaml
# phantom.yaml
model: claude-sonnet-4-6
provider:
  type: zai
  api_key_env: ZAI_API_KEY
  model_mappings:
    opus: glm-5.1
    sonnet: glm-5.1
    haiku: glm-4.5-air
```

```bash
# .env
ZAI_API_KEY=<your-zai-key>
```

Both the main agent and every evolution judge route through Z.AI. The `claude-sonnet-4-6` model name is translated to `glm-5.1` on the wire by the `model_mappings` block.

### Ollama (local, free)

Run any GGUF model on your own GPU. No API key needed.

```yaml
# phantom.yaml
model: claude-sonnet-4-6
provider:
  type: ollama
  model_mappings:
    opus: qwen3-coder:32b
    sonnet: qwen3-coder:32b
    haiku: qwen3-coder:14b
```

Ollama must be running at `http://localhost:11434` (the preset default). The model must support function calling to work with Phantom's agent loop.

### vLLM (self-hosted)

For organizations running their own inference clusters.

```yaml
# phantom.yaml
model: claude-sonnet-4-6
provider:
  type: vllm
  base_url: http://your-vllm-server:8000
  model_mappings:
    sonnet: your-model-name
  timeout_ms: 300000  # local models can be slow on first call
```

Start vLLM with a `--tool-call-parser` that matches your model; without it, tool use will not work.

### OpenRouter

Access 100+ models through a single OpenRouter key.

```yaml
# phantom.yaml
model: claude-sonnet-4-6
provider:
  type: openrouter
  api_key_env: OPENROUTER_API_KEY
  model_mappings:
    sonnet: anthropic/claude-sonnet-4.5
```

### LiteLLM (proxy)

Run a local LiteLLM proxy to bridge OpenAI, Gemini, and other formats.

```yaml
# phantom.yaml
model: claude-sonnet-4-6
provider:
  type: litellm
  api_key_env: LITELLM_KEY
  # base_url defaults to http://localhost:4000
```

### Custom endpoint

For any Anthropic Messages API compatible proxy (LM Studio, custom internal gateways, etc.).

```yaml
# phantom.yaml
model: claude-sonnet-4-6
provider:
  type: custom
  base_url: https://your-proxy.internal/anthropic
  api_key_env: YOUR_CUSTOM_KEY_ENV
```

## Configuration Fields

| Field | Type | Default | Purpose |
|-------|------|---------|---------|
| `type` | enum | `anthropic` | One of the supported provider types |
| `base_url` | URL | preset default | Override the endpoint URL |
| `api_key_env` | string | preset default | Name of the env var holding the credential |
| `model_mappings.opus` | string | none | Concrete model ID for the opus tier |
| `model_mappings.sonnet` | string | none | Concrete model ID for the sonnet tier |
| `model_mappings.haiku` | string | none | Concrete model ID for the haiku tier |
| `disable_betas` | boolean | preset default | Sets `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1`. Defaults to true for every non-anthropic preset. |
| `timeout_ms` | number | none | Sets `API_TIMEOUT_MS` for slow local inference |

## Environment Variable Overrides

For operators who prefer env variables over YAML edits:

| Variable | Effect |
|----------|--------|
| `PHANTOM_PROVIDER_TYPE` | Override `provider.type` (validated against the supported values) |
| `PHANTOM_PROVIDER_BASE_URL` | Override `provider.base_url` (validated as a URL) |
| `PHANTOM_MODEL` | Override `config.model` |

These are applied on top of the YAML-loaded config during startup.
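A minimal sketch of how that override step could work, assuming a `ProviderConfig` shape like the one documented above (the function and type names here are illustrative, not Phantom's actual source):

```typescript
// Hypothetical sketch of the startup env-override step; not Phantom's actual code.
type ProviderType =
  | "anthropic" | "zai" | "openrouter" | "vllm" | "ollama" | "litellm" | "custom";

const PROVIDER_TYPES: ProviderType[] = [
  "anthropic", "zai", "openrouter", "vllm", "ollama", "litellm", "custom",
];

interface ProviderConfig {
  type: ProviderType;
  base_url?: string;
}

function applyEnvOverrides(
  provider: ProviderConfig,
  env: Record<string, string | undefined>,
): ProviderConfig {
  const out = { ...provider };
  const type = env.PHANTOM_PROVIDER_TYPE;
  if (type !== undefined) {
    // Validate against the supported provider types before accepting.
    if (!PROVIDER_TYPES.includes(type as ProviderType)) {
      throw new Error(`Unsupported PHANTOM_PROVIDER_TYPE: ${type}`);
    }
    out.type = type as ProviderType;
  }
  const baseUrl = env.PHANTOM_PROVIDER_BASE_URL;
  if (baseUrl !== undefined) {
    new URL(baseUrl); // throws on a malformed URL, i.e. "validated as a URL"
    out.base_url = baseUrl;
  }
  return out;
}
```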

## How It Works

The Claude Agent SDK runs as a subprocess. The SDK's bundled `cli.js` reads `ANTHROPIC_BASE_URL` and the `ANTHROPIC_DEFAULT_*_MODEL` aliases at call time. When `ANTHROPIC_BASE_URL` points at a non-Anthropic host, all Messages API requests go there instead.

The `provider:` block is translated into those environment variables by `buildProviderEnv()` in [`src/config/providers.ts`](../src/config/providers.ts). The resulting map is merged into both the main agent query and the evolution judge query, so changing providers flips both tiers in lockstep.
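Based on the mappings this document describes, the translation could look roughly like this (an illustrative sketch, not the actual contents of `src/config/providers.ts`):

```typescript
// Illustrative sketch of the provider-block -> env-var translation.
// Field names follow the Configuration Fields table; preset URLs follow the
// Supported Providers table. Details of the real implementation may differ.
interface Provider {
  type: string;
  base_url?: string;
  model_mappings?: { opus?: string; sonnet?: string; haiku?: string };
  disable_betas?: boolean;
  timeout_ms?: number;
}

const PRESET_BASE_URLS: Record<string, string | undefined> = {
  anthropic: undefined, // SDK default: https://api.anthropic.com
  zai: "https://api.z.ai/api/anthropic",
  openrouter: "https://openrouter.ai/api/v1",
  vllm: "http://localhost:8000",
  ollama: "http://localhost:11434",
  litellm: "http://localhost:4000",
};

function buildProviderEnv(p: Provider): Record<string, string> {
  const env: Record<string, string> = {};
  const baseUrl = p.base_url ?? PRESET_BASE_URLS[p.type];
  if (baseUrl) env.ANTHROPIC_BASE_URL = baseUrl;
  const m = p.model_mappings ?? {};
  if (m.opus) env.ANTHROPIC_DEFAULT_OPUS_MODEL = m.opus;
  if (m.sonnet) env.ANTHROPIC_DEFAULT_SONNET_MODEL = m.sonnet;
  if (m.haiku) env.ANTHROPIC_DEFAULT_HAIKU_MODEL = m.haiku;
  // disable_betas defaults to true for every non-anthropic preset.
  if (p.disable_betas ?? p.type !== "anthropic") {
    env.CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS = "1";
  }
  if (p.timeout_ms) env.API_TIMEOUT_MS = String(p.timeout_ms);
  return env;
}
```

The resulting map is then merged into the subprocess environment for both the main agent query and the evolution judge query.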

## Why keep a Claude model name in `model:`?

The bundled `cli.js` has hardcoded model-name arrays for capability detection (thinking tokens, effort levels, compaction, etc.). Passing a literal `glm-5.1` as the model can break those checks. The recommended pattern is:

1. Set `model: claude-sonnet-4-6` (or Opus) in `phantom.yaml` so `cli.js` treats the call as a known Claude model
2. Set `model_mappings.sonnet: glm-5.1` in the provider block so the wire call goes to GLM-5.1

This is the same pattern Z.AI's own documentation recommends.
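For the Z.AI example earlier in this document, the subprocess environment this pattern produces would look roughly like the following (the exact variable set is an assumption derived from the mappings described above):

```bash
# Assumed subprocess environment for the zai preset with sonnet mapped to glm-5.1.
export ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic
export ANTHROPIC_DEFAULT_SONNET_MODEL=glm-5.1
export CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1
```

The `model:` value stays `claude-sonnet-4-6`, so `cli.js` runs its capability checks against a known Claude name while the wire call goes to `glm-5.1`.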

## Troubleshooting

**Phantom responds but the logs show Claude-shaped costs.**
The bundled `cli.js` calculates `total_cost_usd` from its local Claude pricing table based on the model name string. Cost reporting is not provider-aware, so the logged cost will look like Claude pricing even when the request went to Z.AI or another provider. The actual charge on your provider's bill will differ.

**Auto mode judges fall back to heuristic mode.**
`resolveJudgeMode` in auto mode enables LLM judges when any of these are true: (a) a non-anthropic provider is configured, (b) `provider.base_url` is set, (c) `ANTHROPIC_API_KEY` is present, or (d) `~/.claude/.credentials.json` exists. If none hold, judges run in heuristic mode. Set `judges.enabled: always` in `config/evolution.yaml` to force LLM judges on.

**Third-party proxy rejects a beta header.**
`disable_betas: true` is already the default for every non-anthropic preset, which sets `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1`. If you still see beta header errors, set `disable_betas: true` explicitly on your provider block and check that nothing else in your config sets it back to `false`.

**Tool calls fail with small local models.**
Phantom's tool system assumes strong function-calling capability. Models like Qwen3-Coder and GLM-5.1 handle it well; smaller models often fail on complex multi-step tool chains. Test with a strong model first, then step down to smaller ones.

**Subprocess fails with a missing-credential error.**
Phantom does not validate credentials at load time. The subprocess only sees the provider env vars when a query runs. If `api_key_env` names a variable that is not set in the process environment, the subprocess will fail at call time with the provider's own error message.
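A small preflight check can catch the missing credential before startup instead of at call time. This is a hypothetical helper, not a script Phantom ships; pass it the variable name your provider block's `api_key_env` points at:

```bash
# Hypothetical preflight helper: verify the credential env var is set.
# $1 is the variable name from api_key_env (e.g. ZAI_API_KEY).
check_key() {
  if [ -z "$(printenv "$1")" ]; then
    echo "missing: $1 (the SDK subprocess will fail at call time)" >&2
    return 1
  fi
  echo "ok: $1"
}
```

Run it as `check_key ZAI_API_KEY` (or whatever name your config uses) before `bun run src/index.ts`.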
1 change: 1 addition & 0 deletions src/agent/__tests__/prompt-assembler.test.ts
@@ -7,6 +7,7 @@ const baseConfig: PhantomConfig = {
  port: 3100,
  role: "swe",
  model: "claude-opus-4-6",
  provider: { type: "anthropic" },
  effort: "max",
  max_budget_usd: 0,
  timeout_minutes: 240,