# Anthropic Messages API
SwarmLLM provides a full Anthropic Messages API at POST /v1/messages, enabling it to serve as a drop-in backend for Claude Code and other Anthropic-compatible clients.
## Claude Code Integration
Use SwarmLLM as your Claude Code backend to access all models (local, network, and cloud) through a single endpoint:
```bash
ANTHROPIC_BASE_URL=http://localhost:8800 claude --model qwen2.5-coder-7b
```
### Environment Variables
| Variable | Description |
|---|---|
| `ANTHROPIC_BASE_URL` | Point to your SwarmLLM node (e.g., `http://localhost:8800`) |
| `ANTHROPIC_AUTH_TOKEN` | Your node's API key (from Settings or `/api/admin/api-key`) |
| `ANTHROPIC_MODEL` | Default model to use |
## POST /v1/messages
### Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | yes | Model name (local GGUF, network model, or cloud model like `gpt-4o`) |
| `messages` | array | yes | Chat messages with `role` + `content` |
| `max_tokens` | integer | yes | Maximum tokens to generate (clamped to 1–32768) |
| `system` | string or array | no | System prompt (supports `cache_control` blocks) |
| `stream` | boolean | no | Enable SSE streaming |
| `temperature` | float | no | Sampling temperature |
| `top_p` | float | no | Nucleus sampling |
| `stop_sequences` | array | no | Stop sequences, 1–256 chars each, max 16 |
| `tools` | array | no | Tool definitions for function calling |
| `tool_choice` | object | no | Tool selection strategy |
| `metadata` | object | no | Request metadata |
| `thinking` | object | no | Extended thinking configuration |
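The three required fields above are enough for a minimal call. As an illustrative sketch (not the official client — the base URL matches the examples in this document, and `swarm-key` is a placeholder for your node's API key; whether your node expects `x-api-key` or a bearer token follows your `ANTHROPIC_AUTH_TOKEN` setup):

```python
import json
import urllib.request

# Minimal non-streaming request body: only the three required fields.
payload = {
    "model": "qwen2.5-coder-7b",
    "max_tokens": 256,
    "messages": [
        {"role": "user", "content": "Write a haiku about distributed inference."},
    ],
}

def post_messages(payload: dict,
                  base_url: str = "http://localhost:8800",
                  api_key: str = "swarm-key") -> dict:
    """POST the payload to /v1/messages and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": api_key,  # placeholder auth header; adjust to your node
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `post_messages(payload)` against a running node returns a message object shaped like the Response example below.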
### Content Block Types
Messages can contain these content block types:
```jsonc
// Text
{"type": "text", "text": "Hello, world!"}

// Image (base64)
{"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "..."}}

// Tool use (assistant response)
{"type": "tool_use", "id": "toolu_123", "name": "get_weather", "input": {"location": "NYC"}}

// Tool result (user message)
{"type": "tool_result", "tool_use_id": "toolu_123", "content": "72F, sunny"}

// Thinking (extended thinking)
{"type": "thinking", "thinking": "Let me reason about this..."}

// Redacted thinking
{"type": "redacted_thinking", "data": "..."}
```
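A `tool_use` block in the assistant's turn is answered by a `tool_result` block in the next user message, matched by ID. A small sketch of that round trip, reusing the example blocks above (nothing here calls a real server):

```python
# Assistant turn: the model requests a tool call.
assistant_turn = {
    "role": "assistant",
    "content": [
        {"type": "tool_use", "id": "toolu_123",
         "name": "get_weather", "input": {"location": "NYC"}},
    ],
}

# Next user message: the client runs the tool and echoes the id
# back as tool_use_id so the result is paired with the request.
user_reply = {
    "role": "user",
    "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123",
         "content": "72F, sunny"},
    ],
}

def results_answer_uses(turn: dict, reply: dict) -> bool:
    """True if every tool_result answers a tool_use from the prior turn."""
    uses = {b["id"] for b in turn["content"] if b["type"] == "tool_use"}
    results = {b["tool_use_id"] for b in reply["content"]
               if b["type"] == "tool_result"}
    return results <= uses
```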
### Response
```json
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "model": "qwen2.5-coder-7b",
  "content": [
    {"type": "text", "text": "Here's my response..."}
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150
  }
}
```
## Model Routing
Requests are routed based on the model name:
| Model Pattern | Route | Details |
|---|---|---|
| Local GGUF model | Local inference | Tool calls and thinking blocks converted to text |
| `claude-*` | Anthropic API | Full pass-through (all fields preserved, including tools and thinking) |
| `gpt-*`, `o1-*`, `o3-*`, `o4-*` | OpenAI | Anthropic→OpenAI format translation |
| `deepseek-*` | DeepSeek | Anthropic→OpenAI format translation |
| `mistral-*`, `codestral-*`, `pixtral-*` | Mistral | Anthropic→OpenAI format translation |
| `llama-*`, `groq-*` | Groq | Anthropic→OpenAI format translation |
| `nim-*` | NVIDIA NIM | Anthropic→OpenAI format translation |
| `cerebras-*` | Cerebras | Anthropic→OpenAI format translation |
| `samba-*` | SambaNova | Anthropic→OpenAI format translation |
| `fireworks-*`, `accounts/fireworks/*` | Fireworks AI | Anthropic→OpenAI format translation |
| `together-*` | Together AI | Anthropic→OpenAI format translation |
| `deepinfra-*` | DeepInfra | Anthropic→OpenAI format translation |
| `moonshot-*`, `kimi-*` | Moonshot/Kimi | Anthropic→OpenAI format translation |
| Network model | Distributed inference | Routed through swarm P2P network |
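The table above is simple prefix dispatch. A sketch of that logic (this mirrors the table, not SwarmLLM's actual code; the provider labels are ours):

```python
# Prefix → provider mapping, one entry per row of the routing table.
PREFIX_ROUTES = [
    (("claude-",), "anthropic"),
    (("gpt-", "o1-", "o3-", "o4-"), "openai"),
    (("deepseek-",), "deepseek"),
    (("mistral-", "codestral-", "pixtral-"), "mistral"),
    (("llama-", "groq-"), "groq"),
    (("nim-",), "nvidia-nim"),
    (("cerebras-",), "cerebras"),
    (("samba-",), "sambanova"),
    (("fireworks-", "accounts/fireworks/"), "fireworks"),
    (("together-",), "together"),
    (("deepinfra-",), "deepinfra"),
    (("moonshot-", "kimi-"), "moonshot"),
]

def route(model: str) -> str:
    """Return the cloud provider for a model name, by prefix match."""
    for prefixes, provider in PREFIX_ROUTES:
        if model.startswith(prefixes):
            return provider
    # Anything else is served by local inference or the swarm P2P network.
    return "local-or-network"
```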
All 12 cloud providers are supported. Configure API keys via the dashboard Settings page or by placing a `.env` file in the data directory (`~/.local/share/swarmllm/.env`) with standard variable names (e.g., `OPENAI_API_KEY`, `DEEPSEEK_API_KEY`).
## System Blocks with Cache Control
Anthropic-compatible prompt caching:
```json
{
  "system": [
    {"type": "text", "text": "You are a helpful assistant.", "cache_control": {"type": "ephemeral"}}
  ]
}
```
## Streaming (SSE)
When `stream: true` is set, responses arrive as Server-Sent Events following the Anthropic streaming format:
```
event: message_start
data: {"type":"message_start","message":{"id":"msg_123","type":"message","role":"assistant","model":"qwen2.5-coder-7b","content":[]}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: message_stop
data: {"type":"message_stop"}
```
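A client reassembles the text by concatenating the `text_delta` fragments from `content_block_delta` events. A minimal parsing sketch over a raw SSE body (real clients should read the stream incrementally rather than buffering it):

```python
import json

def collect_text(sse_body: str) -> str:
    """Accumulate text_delta fragments from an Anthropic-style SSE body."""
    parts = []
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blank separators
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
    return "".join(parts)
```

Fed the example stream above, this yields the single fragment `"Hello"`.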