AI Providers
This page documents the `[ai.providers]`, `[ai.routing]`, and `[ai.fallback]` sections of an instance's `config.toml`. All three are optional — an instance with no `[ai.*]` section uses a sensible default stack (DeepSeek for chat, Gemini for vision, DeepSeek Reasoner for hard questions, no CENSORED cascade).
Quick reference
```toml
# instances/<your-bot>/config.toml

[ai.providers.<name>]
url = "https://..."      # required: full chat-completions endpoint
model = "..."            # required: model identifier
api_key_env = "..."      # required: name of env var holding the bearer token
max_tokens = 8192        # required: per-provider hard cap on response tokens
timeout_secs = 30        # optional, default 30
supports_vision = false  # optional, default false
supports_tools = true    # optional, default true
is_reasoner = false      # optional, default false
spec = "openai"          # optional, default "openai"

[ai.routing]
chat = "<provider-name>"     # required if section present
vision = "<provider-name>"   # optional — if omitted, image messages fall through to chat
reasoner = "<provider-name>" # optional — if omitted, classifier step is skipped

[ai.fallback]
on_censored = ["<name>", "..."] # CENSORED-cascade chain (optional)
```
Default registry
When [ai.providers] is absent, the bot ships these four definitions:
| Name | URL | Model | Env var | max_tokens | timeout | vision | tools | reasoner |
|---|---|---|---|---|---|---|---|---|
| `deepseek_chat` | https://api.deepseek.com/chat/completions | deepseek-v4-flash | DEEPSEEK_API_KEY | 8192 | 30 | no | yes | no |
| `deepseek_reasoner` | https://api.deepseek.com/chat/completions | deepseek-v4-pro | DEEPSEEK_API_KEY | 65536 | 300 | no | no | yes |
| `gemini_flash` | https://generativelanguage.googleapis.com/v1beta/openai/chat/completions | gemini-3-flash-preview | GEMINI_API_KEY | 16384 | 30 | yes | yes | no |
| `grok` | https://api.x.ai/v1/chat/completions | grok-3 | GROK_API_KEY | 16384 | 30 | no | yes | no |
A provider whose api_key_env resolves to an unset/empty env var is “unavailable” — defined but not usable. The bot starts and runs without it; AI features that depend on it are silently disabled (with a warning at startup if anything references it).
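The availability rule above can be sketched in a few lines. This is illustrative Python, not the bot's actual Rust; `is_available` is a hypothetical name:

```python
import os

def is_available(api_key_env: str) -> bool:
    # A provider is usable only when its env var is set and non-empty.
    return bool(os.environ.get(api_key_env, "").strip())

os.environ["DEEPSEEK_API_KEY"] = "sk-example"  # pretend this key is configured
os.environ.pop("GROK_API_KEY", None)           # and this one is not
print(is_available("DEEPSEEK_API_KEY"))  # True  -> provider usable
print(is_available("GROK_API_KEY"))      # False -> defined but unavailable
```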
Default routing
When `[ai.routing]` is absent:

```toml
chat = "deepseek_chat"
vision = "gemini_flash"
reasoner = "deepseek_reasoner"
```
This is exactly the routing behaviour of every release before 0.15.0. Existing instances pick it up automatically with no config changes.
Routing degradation rules
When [ai.routing] IS present, only the keys you write take effect. There’s no field-merge with the defaults above. Specifically:
| Role | If set | If omitted |
|---|---|---|
| chat | Must resolve to a configured provider; otherwise the bot panics at startup | Required — bot panics at startup |
| vision | Must resolve | Image-bearing requests fall through to chat with a warning log |
| reasoner | Must resolve | Classifier step is skipped; every request goes to chat |
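These degradation rules amount to a small per-role lookup. A minimal sketch in illustrative Python, assuming a plain dict of routing keys and a set of configured provider names (the real startup validation lives in the bot's Rust code):

```python
def resolve_role(routing, role, known):
    """Resolve one routing role against the configured provider names."""
    name = routing.get(role)
    if name is None:
        if role == "chat":
            raise ValueError("[ai.routing] chat is required")  # bot panics at startup
        return None  # graceful degrade: vision -> chat fallthrough, reasoner skipped
    if name not in known:
        raise ValueError(f"[ai.routing] {role} -> unknown provider {name!r}")  # panic
    return name

known = {"my_local"}
print(resolve_role({"chat": "my_local"}, "chat", known))    # my_local
print(resolve_role({"chat": "my_local"}, "vision", known))  # None
```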
This lets you write a one-model config:
```toml
[ai.providers.my_local]
url = "http://localhost:11434/v1/chat/completions"
model = "llama3.1:70b"
api_key_env = "LOCAL_LLM_KEY"
max_tokens = 8192

[ai.routing]
chat = "my_local"
# vision and reasoner omitted → graceful degrade
```
Disabling V4-Pro flagship
`deepseek_reasoner` defaults to DeepSeek V4-Pro (the 1.6T-parameter flagship). V4-Pro output costs roughly 12× V4-Flash output per token, so high-volume reasoner traffic adds up quickly. The existing routing system already provides the off-switch — no per-feature boolean is needed.
To skip V4-Pro entirely without redefining a provider, point the reasoner role at the cheaper V4-Flash:
```toml
[ai.routing]
reasoner = "deepseek_chat"
```
To disable the reasoner role altogether — the bot will never invoke a
reasoner provider, and every chat goes through the chat role — set
[ai.routing] and omit reasoner:
```toml
[ai.routing]
chat = "deepseek_chat"
vision = "gemini_flash"
# reasoner intentionally omitted — graceful degrade
```
Either pattern leaves V4-Pro unconfigured by routing and unbilled by DeepSeek.
Provider definitions
Each [ai.providers.<name>] block is independent. The <name> is your handle for the provider — used in [ai.routing] lookups, in [ai.fallback] on_censored lists, and in log lines.
Required fields
- `url` — full HTTPS endpoint (the chat-completions URL, including any version path).
- `model` — model identifier the provider expects in the request body.
- `api_key_env` — name of an environment variable. The bot reads `std::env::var(api_key_env)` at startup; an unset/empty value marks the provider unavailable.
- `max_tokens` — per-provider hard cap on response tokens. The orchestration layer asks for whatever budget it wants and clamps to this cap.
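The `max_tokens` clamp is just a minimum over the two numbers. An illustrative sketch (`clamp_budget` is a hypothetical name, not the bot's actual function):

```python
def clamp_budget(requested: int, provider_cap: int) -> int:
    # The orchestration layer asks for any budget; the provider cap wins.
    return min(requested, provider_cap)

print(clamp_budget(32768, 8192))  # 8192 — capped by the provider
print(clamp_budget(4096, 8192))   # 4096 — a request under the cap passes through
```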
Optional fields
- `timeout_secs` (default `30`) — HTTP timeout for chat-completions calls. DeepSeek Reasoner needs `300` (5 minutes); fast chat models are fine with the default.
- `supports_vision` (default `false`) — whether the model accepts image content parts. Today only the Gemini default registry entry sets this to `true`.
- `supports_tools` (default `true`) — whether the model accepts a `tools` array. DeepSeek Reasoner is the standout exception (set to `false`).
- `is_reasoner` (default `false`) — flags a slow reasoning model. The orchestration layer uses this signal alongside the longer `timeout_secs` budget you should also set.
- `spec` (default `"openai"`) — request/response shape. `"openai"` and `"anthropic"` (added in 0.16.0) are supported. See Anthropic spec for details.
Provider name rules
- Non-empty after `.trim()`
- No internal whitespace characters
- TOML's bare-key rules already constrain `[ai.providers.<name>]` syntax to safe characters (alphanumeric, underscore, hyphen, dot)
- A user-defined name that matches a default-registry name (e.g. `gemini_flash`) fully replaces the default — no field-level merge
Backward-compatible alias names
For instance configs that pin model-string aliases instead of canonical provider names, the bot recognises a small set of short aliases at lookup time. They were introduced in two waves:
| Alias | Resolves to | Added in |
|---|---|---|
| `gemini` | `gemini_flash` | 0.14.0 |
| `deepseek` | `deepseek_chat` | 0.14.0 |
| `deepseek-chat` | `deepseek_chat` | 0.14.0 |
| `deepseek-v4` | `deepseek_chat` | 0.18.0 |
| `deepseek-v4-flash` | `deepseek_chat` | 0.18.0 |
| `deepseek-v4-pro` | `deepseek_reasoner` | 0.18.0 |
| `deepseek-reasoner` | `deepseek_reasoner` | 0.18.0 |
These aliases work in `[ai.fallback]` `on_censored` and in any other place the bot looks up a provider by name at request time. They are not accepted by `[ai.routing]` startup validation — `chat = "gemini"` in `[ai.routing]` panics at startup, with the canonical name (`gemini_flash`) in the error message.

The 0.14.0 aliases exist so an `on_censored = ["grok", "gemini"]` line copied from a 0.14.0 example doesn't silently produce an empty cascade. The 0.18.0 aliases preserve `[ai.fallback]` configs that named DeepSeek's `deepseek-reasoner` (retiring 2026-07-24) and add forward-compatible spellings for the explicit V4 model names.

New configs should use the canonical names (`deepseek_chat`, `deepseek_reasoner`, `gemini_flash`, `grok`).
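The lookup-time behaviour amounts to consulting a fixed alias table before hitting the registry. An illustrative Python sketch — the map contents mirror the table above, but `canonicalize` is a hypothetical name, not the bot's actual function:

```python
ALIASES = {
    # 0.14.0 wave
    "gemini": "gemini_flash",
    "deepseek": "deepseek_chat",
    "deepseek-chat": "deepseek_chat",
    # 0.18.0 wave
    "deepseek-v4": "deepseek_chat",
    "deepseek-v4-flash": "deepseek_chat",
    "deepseek-v4-pro": "deepseek_reasoner",
    "deepseek-reasoner": "deepseek_reasoner",
}

def canonicalize(name: str) -> str:
    # Canonical names pass through untouched; only known aliases are rewritten.
    return ALIASES.get(name, name)

print(canonicalize("gemini"))         # gemini_flash
print(canonicalize("deepseek_chat"))  # deepseek_chat (already canonical)
```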
Anthropic spec
As of 0.16.0, `spec = "anthropic"` enables native Anthropic `/v1/messages` routing. This is useful for using Claude directly without going through an OpenAI-compat proxy — native routing preserves Claude's structured tool use, vision content parts, and prompt caching (future work).
An Anthropic provider definition looks like this:
```toml
[ai.providers.claude]
spec = "anthropic"
url = "https://api.anthropic.com/v1/messages"
model = "claude-opus-4-7"
api_key_env = "ANTHROPIC_API_KEY"
max_tokens = 8192
supports_vision = true
supports_tools = true

# Anthropic's auth is x-api-key with no scheme prefix.
auth_header = "x-api-key"
auth_scheme = ""

# Anthropic's required version header.
headers = { "anthropic-version" = "2023-06-01" }
```
New fields (also available to OpenAI providers)
- `headers` — `HashMap<String, String>` of extra HTTP headers. Default empty. Values must be printable ASCII. Use inline-table syntax (`headers = { "x" = "y" }`) for 1-2 headers, or a sub-table (`[ai.providers.claude.headers]`) for longer lists.
- `auth_header` — name of the auth header. Default `"Authorization"`. Must be non-empty.
- `auth_scheme` — prefix prepended to the API key. Default `"Bearer "` (with trailing space). Use `""` for Anthropic.
These fields are respected by both the OpenAI and Anthropic paths — you can use them on any provider that needs custom auth or headers (e.g. a self-hosted endpoint requiring a custom x-internal-auth header).
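Putting `auth_header`, `auth_scheme`, and `headers` together, the request headers are assembled roughly like this (illustrative Python; `build_headers` is a hypothetical helper, not the bot's actual code):

```python
def build_headers(api_key, auth_header="Authorization", auth_scheme="Bearer ", extra=None):
    # Defaults reproduce OpenAI-style bearer auth; the scheme keeps its trailing space.
    headers = {auth_header: auth_scheme + api_key}
    headers.update(extra or {})
    return headers

# OpenAI-style default:
print(build_headers("sk-1"))  # {'Authorization': 'Bearer sk-1'}
# Anthropic-style override:
print(build_headers("sk-1", auth_header="x-api-key", auth_scheme="",
                    extra={"anthropic-version": "2023-06-01"}))
```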
Translation — what the bot handles automatically
When you route to an Anthropic provider, the bot translates every shape difference transparently. The internal tool definitions, system prompt, and message history are all built in OpenAI shape (the bot’s internal canonical form) and translated to Anthropic’s wire shape on each request. You never need to write Anthropic-specific prompt logic.
Translation covers:
- System prompt → top-level `system` field on the request body (not a `role: "system"` message in the array)
- Image content parts → base64 `source` blocks with correct `media_type`
- Tool definitions → flat `{name, description, input_schema}` shape
- Tool call responses → `tool_use` content blocks are extracted into the same flat `ToolCall { id, name, arguments }` shape the bot uses internally
- Tool result messages → wrapped in user-content `tool_result` blocks
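As one concrete case, the system-prompt rule can be sketched like this. Illustrative Python operating on OpenAI-shaped message dicts; `split_system` is a hypothetical name, not the bot's actual translation code:

```python
def split_system(messages):
    """OpenAI keeps system prompts in the message array; Anthropic wants a
    top-level `system` field, so pull system messages out before sending."""
    system = "\n".join(m["content"] for m in messages if m["role"] == "system")
    rest = [m for m in messages if m["role"] != "system"]
    return system, rest

system, rest = split_system([
    {"role": "system", "content": "You are a helpful bot."},
    {"role": "user", "content": "hi"},
])
print(system)  # You are a helpful bot.
print(rest)    # [{'role': 'user', 'content': 'hi'}]
```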
What works with Claude today
- Text chat (any routing role)
- Vision (when `supports_vision = true`)
- Tool use (when `supports_tools = true`)
- Multi-round search via the CENSORED cascade / `[ai.fallback]` `on_censored = ["claude", ...]` as a post-DeepSeek-refusal fallback
- Mixed setups: DeepSeek primary + Claude as cascade member, or Claude primary + DeepSeek as reasoner, etc.
What’s not yet available
- Streaming responses
- Anthropic's `cache_control` ephemeral blocks for prompt caching
- Structured "thinking" / extended reasoning outputs (Claude's reasoning models)
These are tracked as future enhancements; today’s integration gives feature parity with DeepSeek/Gemini for the bot’s standard workflow.
Validation behaviour
Performed once at startup, before the bot connects to Discord:
| Case | Behaviour |
|---|---|
| `[ai.routing]` chat unset OR points at unknown provider name | Panic |
| `[ai.routing]` vision / reasoner set to unknown provider name | Panic |
| `[ai.routing]` section present without chat | Panic |
| Provider name contains whitespace | Panic |
| Provider with `spec = "anthropic"` | Fully supported as of 0.16.0 — see Anthropic spec |
| Provider's `api_key_env` resolves to unset env var | Provider marked unavailable |
| Routing or fallback references an unavailable provider | Warn at startup |
| `[ai.routing]` vision references provider with `supports_vision = false` | Warn at startup |
| `[ai.routing]` reasoner references provider with `is_reasoner = false` | Warn at startup |
| `[ai.fallback]` on_censored references defined-but-unavailable provider | Warn at startup; `cascade_for` skips it at request time |
| `[ai.fallback]` on_censored references completely unknown name | Warn at request time only (via `cascade_for`); silently produces empty cascade entry. Common when migrating from 0.14.0 configs that used aliases — see Backward-compatible alias names above |
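The request-time skipping in the last two rows can be sketched as follows. Illustrative Python: the real `cascade_for` lives in the bot's Rust code, and the signature here is an assumption:

```python
def cascade_for(on_censored, available):
    # `available` maps provider name -> usable?; unknown names are simply absent.
    chain = []
    for name in on_censored:
        if not available.get(name, False):
            print(f"warning: skipping {name!r} in CENSORED cascade")  # warn, don't panic
            continue
        chain.append(name)
    return chain

available = {"grok": True, "gemini_flash": False}  # Gemini key unset
print(cascade_for(["grok", "gemini_flash", "typo_name"], available))  # ['grok']
```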
Worked examples
One-model setup (local Ollama)
```toml
[ai.providers.local]
url = "http://localhost:11434/v1/chat/completions"
model = "llama3.1:70b"
api_key_env = "OLLAMA_API_KEY" # any non-empty value works for Ollama
max_tokens = 8192

[ai.routing]
chat = "local"
```
Vision and reasoner degrade gracefully: image messages are handed to the local model anyway, and the classifier step is skipped, so every prompt goes straight to chat.
Three providers + cascade
```toml
[ai.providers.openai_gpt]
url = "https://api.openai.com/v1/chat/completions"
model = "gpt-4o"
api_key_env = "OPENAI_API_KEY"
max_tokens = 16384
supports_vision = true

[ai.routing]
chat = "openai_gpt"
vision = "openai_gpt"          # reuse same provider for vision
reasoner = "deepseek_reasoner" # keep the default reasoner

[ai.fallback]
on_censored = ["grok", "gemini_flash"]
```
Override a default
To use `gemini_flash` but with a different model:

```toml
[ai.providers.gemini_flash]
url = "https://generativelanguage.googleapis.com/v1beta/openai/chat/completions"
model = "gemini-2.5-pro" # not the default flash; check ai.google.dev for newer Pro models
api_key_env = "GEMINI_API_KEY"
max_tokens = 32768
supports_vision = true
```
The user definition fully replaces the default gemini_flash entry. No [ai.routing] change needed if you’re keeping the same role assignment.
See also
- Environment variables — `*_API_KEY` reference
- Instance config — full `config.toml` schema
- `defaults/example-providers.toml` — copy-paste catalogue for popular endpoints
- AI Chat feature page
- Issue #28 — design context + phase 2 follow-up