LLM Providers
Configure OpenAI, Anthropic, Google Gemini, Ollama, AWS Bedrock, or any OpenAI-compatible endpoint.
Wikis uses LangChain interfaces internally, so every provider is interchangeable. Set LLM_PROVIDER in your .env to switch.
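The switch itself is plain environment-driven dispatch. A rough sketch of that idea (Wikis's real wiring goes through LangChain; the provider set matches the sections below, but the dispatch function and defaults here are illustrative assumptions):

```python
import os

# Illustrative defaults only; Wikis's actual defaults may differ.
DEFAULT_API_BASE = {
    "openai": "https://api.openai.com/v1",
    "ollama": "http://localhost:11434",
}

def resolve_provider(env=os.environ):
    """Resolve provider settings from LLM_* environment variables."""
    name = env.get("LLM_PROVIDER", "openai")
    if name not in {"openai", "anthropic", "gemini", "ollama", "bedrock"}:
        raise ValueError(f"unsupported LLM_PROVIDER: {name!r}")
    return {
        "provider": name,
        "model": env.get("LLM_MODEL"),
        # LLM_API_BASE overrides the default, e.g. for OpenAI-compatible servers
        "api_base": env.get("LLM_API_BASE", DEFAULT_API_BASE.get(name)),
    }
```

Because everything hangs off LLM_PROVIDER, switching vendors is a .env edit plus a restart, not a code change.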
Supported providers
OpenAI
Works with GPT-4o, GPT-4o-mini, and o-series reasoning models.
LLM_PROVIDER=openai
LLM_API_KEY=sk-proj-…
LLM_MODEL=gpt-4o-mini # fast and cheap; good default
# LLM_MODEL=gpt-4o # higher quality generation
EMBEDDING_MODEL=text-embedding-3-large

Install: included in the base Docker image.
Anthropic
Claude models — Haiku 4.5 (fast), Sonnet 4.6 (balanced), Opus 4.6 (highest quality).
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-…
LLM_MODEL=claude-sonnet-4-6
# LLM_MODEL=claude-haiku-4-5-20251001 # faster, lower cost
# Embeddings fall back to a secondary provider or use the LLM

Install (if running outside Docker):
pip install -e ".[all-providers]"

Google Gemini
Gemini models with native embedding support.
LLM_PROVIDER=gemini
LLM_API_KEY=AIza…
LLM_MODEL=gemini-2.5-pro
# LLM_MODEL=gemini-2.0-flash # faster, lower cost
EMBEDDING_MODEL=models/text-embedding-004

Install:
pip install -e ".[all-providers]"

Ollama
Run models completely locally — no API key required.
LLM_PROVIDER=ollama
LLM_API_BASE=http://localhost:11434
LLM_MODEL=llama3.2
EMBEDDING_MODEL=nomic-embed-text

Prerequisites:
- Install Ollama
- Pull the models:
ollama pull llama3.2 && ollama pull nomic-embed-text
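After pulling, you can confirm both models are visible to Ollama by querying its /api/tags endpoint (Ollama's real list-models route; the helper below is an illustrative sketch):

```python
import json
import urllib.request

REQUIRED = ("llama3.2", "nomic-embed-text")

def missing_models(tags_json: str, required=REQUIRED):
    """Return required models absent from an Ollama /api/tags response."""
    # Tag names look like "llama3.2:latest"; compare on the base name.
    installed = {m["name"].split(":")[0] for m in json.loads(tags_json)["models"]}
    return [m for m in required if m not in installed]

def check_ollama(base="http://localhost:11434"):
    """Fetch the local model list from a running Ollama server."""
    with urllib.request.urlopen(f"{base}/api/tags") as resp:
        return missing_models(resp.read().decode())
```

An empty list from check_ollama() means both models are ready.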
Install:
pip install -e ".[ollama]"

AWS Bedrock
Use Claude or Titan models via AWS.
LLM_PROVIDER=bedrock
AWS_ACCESS_KEY_ID=…
AWS_SECRET_ACCESS_KEY=…
AWS_DEFAULT_REGION=us-east-1
LLM_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
EMBEDDING_MODEL=amazon.titan-embed-text-v1

Install:
pip install -e ".[all-providers]"

OpenAI-compatible endpoints
Any server implementing the OpenAI API — Together AI, Groq, Mistral, vLLM, LM Studio, etc.
LLM_PROVIDER=openai
LLM_API_KEY=your-key-or-placeholder
LLM_API_BASE=https://api.together.xyz/v1
LLM_MODEL=meta-llama/Llama-3-70b-chat-hf

Set LLM_API_BASE to the endpoint's base URL. The model name must match what the endpoint expects.
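Any endpoint claiming OpenAI compatibility must accept the standard chat-completions request shape, which is handy for smoke-testing a server before pointing Wikis at it. A stdlib-only sketch (the helper name is ours; the /chat/completions route and JSON payload follow the OpenAI API):

```python
import json
import urllib.request

def chat_request(api_base, api_key, model, prompt):
    """Build (but do not send) a POST to the /chat/completions route."""
    return urllib.request.Request(
        api_base.rstrip("/") + "/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request with urllib.request.urlopen and getting a 200 back is a quick way to confirm the base URL and model name are right.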
Choosing a model
| Use case | Recommendation |
|---|---|
| Best quality wikis | gpt-4o, claude-sonnet-4-6, gemini-2.5-pro |
| Fast + affordable | gpt-4o-mini, claude-haiku-4-5, gemini-2.0-flash |
| Fully local / offline | Ollama with llama3.3 or qwen3.5 |
| Enterprise / on-prem | AWS Bedrock with Claude Sonnet 4 |
Separate embedding model
Wikis uses a separate embedding model for vector search. By default it uses the same provider as the LLM. To use a different provider for embeddings:
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-…
LLM_MODEL=claude-sonnet-4-6
# Use OpenAI embeddings alongside Anthropic LLM
EMBEDDING_MODEL=text-embedding-3-large
OPENAI_API_KEY=sk-proj-… # Additional key for embeddings only

Verifying provider health
curl http://localhost:8000/api/v1/health

Response:
{
"status": "ok",
"llm_provider": "openai",
"llm_model": "gpt-4o-mini",
"embedding_model": "text-embedding-3-large",
"version": "1.0.0"
}

If status is not ok, check your API key and model name in .env.
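If you script this check, for example in CI or a startup probe, the response can be validated in a few lines (field names are taken from the sample response above; the helper itself is illustrative):

```python
import json

def check_health(body: str):
    """Raise if a /api/v1/health payload reports a non-ok status."""
    health = json.loads(body)
    if health.get("status") != "ok":
        raise RuntimeError(f"provider unhealthy: {health}")
    return health["llm_provider"], health["llm_model"]
```

Pair it with the curl command above (or urllib) to fail fast on a misconfigured key or model name.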
Switching providers on an existing instance
Changing LLM_PROVIDER or EMBEDDING_MODEL on an instance with existing wikis will break Q&A: the stored vector indexes were built with the old embedding model, so new query embeddings no longer match them. To fix this:
- Update the provider settings in .env
- Restart the services: docker compose up -d
- Re-generate each wiki using the ↺ Refresh button (this rebuilds the index with the new embedding model)
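The re-generation step matters because vectors from different embedding models are not comparable; often the dimensions themselves differ (text-embedding-3-large emits 3072-dimensional vectors, nomic-embed-text 768), so similarity search against a stale index either errors out or returns noise. A minimal illustration of the failure mode:

```python
def cosine(u, v):
    """Cosine similarity; fails loudly when index and query dims differ."""
    if len(u) != len(v):
        raise ValueError(f"embedding dimension mismatch: {len(u)} vs {len(v)}")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)
```

Even when two models happen to share a dimension, their vector spaces are unrelated, so the similarity scores are meaningless; only rebuilding the index restores correct retrieval.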