LLM Providers
Configure OpenAI, Anthropic, Google Gemini, Ollama, AWS Bedrock, or any OpenAI-compatible endpoint.
Wikis uses LangChain interfaces internally, so every provider is interchangeable. Set LLM_PROVIDER in your .env to switch.
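The switch itself is plain environment-driven dispatch. A rough sketch of that idea (Wikis's real wiring goes through LangChain; the provider set matches the sections below, but the dispatch function and defaults here are illustrative assumptions):

```python
import os

# Illustrative defaults only; Wikis's actual defaults may differ.
DEFAULT_API_BASE = {
    "openai": "https://api.openai.com/v1",
    "ollama": "http://localhost:11434",
}

def resolve_provider(env=os.environ):
    """Resolve provider settings from LLM_* environment variables."""
    name = env.get("LLM_PROVIDER", "openai")
    if name not in {"openai", "anthropic", "gemini", "ollama", "bedrock"}:
        raise ValueError(f"unsupported LLM_PROVIDER: {name!r}")
    return {
        "provider": name,
        "model": env.get("LLM_MODEL"),
        # LLM_API_BASE overrides the default, e.g. for OpenAI-compatible servers
        "api_base": env.get("LLM_API_BASE", DEFAULT_API_BASE.get(name)),
    }
```

Because everything hangs off LLM_PROVIDER, switching vendors is a .env edit plus a restart, not a code change.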
Supported providers
OpenAI
Works with GPT-4o, GPT-4o-mini, and o-series reasoning models.
LLM_PROVIDER=openai
LLM_API_KEY=sk-proj-…
LLM_MODEL=gpt-4o-mini # fast and cheap; good default
# LLM_MODEL=gpt-4o # higher quality generation
EMBEDDING_MODEL=text-embedding-3-large

Install: included in the base Docker image.
Anthropic
Claude models — Haiku 4.5 (fast), Sonnet 4.6 (balanced), Opus 4.6 (highest quality).
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-…
LLM_MODEL=claude-sonnet-4-6
# LLM_MODEL=claude-haiku-4-5-20251001 # faster, lower cost
# Embeddings fall back to a secondary provider or use the LLM

Install (if running outside Docker):
pip install -e ".[all-providers]"

Google Gemini
Gemini models with native embedding support.
LLM_PROVIDER=gemini
LLM_API_KEY=AIza…
LLM_MODEL=gemini-2.5-pro
# LLM_MODEL=gemini-2.0-flash # faster, lower cost
EMBEDDING_MODEL=models/text-embedding-004

Install:
pip install -e ".[all-providers]"

Ollama
Run models completely locally — no API key required.
LLM_PROVIDER=ollama
LLM_API_BASE=http://localhost:11434
LLM_MODEL=llama3.2
EMBEDDING_MODEL=nomic-embed-text

Prerequisites:
- Install Ollama
- Pull the models:
ollama pull llama3.2 && ollama pull nomic-embed-text
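After pulling, you can confirm both models are visible to Ollama by querying its /api/tags endpoint (Ollama's real list-models route; the helper below is an illustrative sketch):

```python
import json
import urllib.request

REQUIRED = ("llama3.2", "nomic-embed-text")

def missing_models(tags_json: str, required=REQUIRED):
    """Return required models absent from an Ollama /api/tags response."""
    # Tag names look like "llama3.2:latest"; compare on the base name.
    installed = {m["name"].split(":")[0] for m in json.loads(tags_json)["models"]}
    return [m for m in required if m not in installed]

def check_ollama(base="http://localhost:11434"):
    """Fetch the local model list from a running Ollama server."""
    with urllib.request.urlopen(f"{base}/api/tags") as resp:
        return missing_models(resp.read().decode())
```

An empty list from check_ollama() means both models are ready.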
Install:
pip install -e ".[ollama]"

AWS Bedrock
Use Claude or Titan models via AWS.
LLM_PROVIDER=bedrock
AWS_ACCESS_KEY_ID=…
AWS_SECRET_ACCESS_KEY=…
AWS_DEFAULT_REGION=us-east-1
LLM_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
EMBEDDING_MODEL=amazon.titan-embed-text-v1

Install:
pip install -e ".[all-providers]"

OpenAI-compatible endpoints
Any server implementing the OpenAI API — Together AI, Groq, Mistral, vLLM, LM Studio, etc.
LLM_PROVIDER=openai
LLM_API_KEY=your-key-or-placeholder
LLM_API_BASE=https://api.together.xyz/v1
LLM_MODEL=meta-llama/Llama-3-70b-chat-hf

Set LLM_API_BASE to the endpoint's base URL. The model name must match what the endpoint expects.
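Any endpoint claiming OpenAI compatibility must accept the standard chat-completions request shape, which is handy for smoke-testing a server before pointing Wikis at it. A stdlib-only sketch (the helper name is ours; the /chat/completions route and JSON payload follow the OpenAI API):

```python
import json
import urllib.request

def chat_request(api_base, api_key, model, prompt):
    """Build (but do not send) a POST to the /chat/completions route."""
    return urllib.request.Request(
        api_base.rstrip("/") + "/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request with urllib.request.urlopen and getting a 200 back is a quick way to confirm the base URL and model name are right.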
Choosing a model
| Use case | Recommendation |
|---|---|
| Best quality wikis | gpt-4o, claude-sonnet-4-6, gemini-2.5-pro |
| Fast + affordable | gpt-4o-mini, claude-haiku-4-5, gemini-2.0-flash |
| Fully local / offline | Ollama with llama3.3 or qwen3.5 |
| Enterprise / on-prem | AWS Bedrock with Claude Sonnet 4 |
Separate embedding model
Wikis uses a separate embedding model for vector search. By default it uses the same provider as the LLM. To use a different provider for embeddings:
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-…
LLM_MODEL=claude-sonnet-4-6
# Use OpenAI embeddings alongside Anthropic LLM
EMBEDDING_MODEL=text-embedding-3-large
OPENAI_API_KEY=sk-proj-… # Additional key for embeddings only

Verifying provider health
curl http://localhost:8000/api/v1/health

Response:
{
"status": "ok",
"llm_provider": "openai",
"llm_model": "gpt-4o-mini",
"embedding_model": "text-embedding-3-large",
"version": "1.0.0"
}

If status is not ok, check your API key and model name in .env.
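If you script this check, for example in CI or a startup probe, the response can be validated in a few lines (field names are taken from the sample response above; the helper itself is illustrative):

```python
import json

def check_health(body: str):
    """Raise if a /api/v1/health payload reports a non-ok status."""
    health = json.loads(body)
    if health.get("status") != "ok":
        raise RuntimeError(f"provider unhealthy: {health}")
    return health["llm_provider"], health["llm_model"]
```

Pair it with the curl command above (or urllib) to fail fast on a misconfigured key or model name.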
Switching providers on an existing instance
Changing LLM_PROVIDER or EMBEDDING_MODEL on an instance with existing wikis will break Q&A: the stored vector indexes were built with the old embedding model, so new query embeddings no longer match them. To fix this:
- Update the provider settings in .env
- Restart the services: docker compose up -d
- Re-generate each wiki using the ↺ Refresh button (this rebuilds the index with the new embedding model)
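The re-generation step matters because vectors from different embedding models are not comparable; often the dimensions themselves differ (text-embedding-3-large emits 3072-dimensional vectors, nomic-embed-text 768), so similarity search against a stale index either errors out or returns noise. A minimal illustration of the failure mode:

```python
def cosine(u, v):
    """Cosine similarity; fails loudly when index and query dims differ."""
    if len(u) != len(v):
        raise ValueError(f"embedding dimension mismatch: {len(u)} vs {len(v)}")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)
```

Even when two models happen to share a dimension, their vector spaces are unrelated, so the similarity scores are meaningless; only rebuilding the index restores correct retrieval.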