# Setting Up LLM Providers

Step-by-step guide for configuring LLM providers in Binexia.
## How LLM Keys Work

All provider keys live in `.env.testip.local`. Two services read them:

- **Agno** reads keys directly via the `litellm` Python library (in-process, lowest latency)
- **LiteLLM Proxy** reads keys from `services/litellm/config.yaml` and exposes an OpenAI-compatible API for Dify and other services

Both use the same keys — no duplication.
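Since both services resolve keys from the same env file, a single edit updates every consumer. As a minimal sketch of that "one source of truth" idea (the parser below is illustrative, not Binexia code; the file name and key names follow this guide's examples):

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """
# .env.testip.local
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
"""

keys = parse_env(sample)
# Agno reads the variable in-process; the LiteLLM Proxy resolves the
# same variable through config.yaml -- one file feeds both services.
print(keys["OPENAI_API_KEY"])
```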
## Quick Setup

Edit `.env.testip.local` and add at least one provider key:

```bash
# Cheapest option — works for all agents
OPENAI_API_KEY=sk-...

# Better reasoning — recommended for Analytics and Behavioral agents
ANTHROPIC_API_KEY=sk-ant-...

# LiteLLM proxy master key (for Dify access)
LITELLM_MASTER_KEY=sk-litellm-your-secret
```

Then restart containers:

```bash
docker compose -f docker-compose.ip-test.yml --env-file .env.testip.local up -d
```

No rebuild needed — env vars are read on container start.
## Verifying Setup

### Agno (direct)

```bash
curl http://$HOST_IP:8001/health
```

### LiteLLM Proxy

```bash
curl http://$HOST_IP:4000/health
```

Test a completion through the proxy:

```bash
curl http://$HOST_IP:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 50}'
```
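The proxy answers in the standard OpenAI chat-completions format, so any OpenAI-compatible client can consume it. A sketch of pulling the reply and token count out of a response body (the JSON below is a hand-written example of the format, not captured output):

```python
import json

# Example response body in the OpenAI chat-completions format
# (field values are illustrative).
body = json.dumps({
    "model": "gpt-4o-mini",
    "choices": [
        {"index": 0,
         "message": {"role": "assistant", "content": "Hello!"},
         "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 8, "completion_tokens": 2, "total_tokens": 10},
})

resp = json.loads(body)
reply = resp["choices"][0]["message"]["content"]
tokens = resp["usage"]["total_tokens"]
print(reply, tokens)  # -> Hello! 10
```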
## Verifying a Provider Works

Test a simple query from the UI: log in, open a dashboard, and type a question in the Raw Query widget. If you get a chart, the LLM is working.
## LiteLLM Proxy Configuration

The proxy config is at `services/litellm/config.yaml`. It maps friendly model names to provider-specific parameters:

```yaml
model_list:
  - model_name: gpt-4o-mini
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: glm-4-flash
    litellm_params:
      model: openai/glm-4-flash
      api_key: os.environ/GLM_API_KEY
      api_base: https://open.bigmodel.cn/api/paas/v4
```

To add a new model, add it to `config.yaml` and restart the litellm container:

```bash
docker compose -f docker-compose.ip-test.yml restart litellm
```
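The `api_key: os.environ/VAR` entries in the config are references resolved from the environment at runtime, which is how the proxy shares keys with `.env.testip.local` instead of duplicating them. A minimal sketch of that resolution convention (illustrative only, not litellm's actual implementation):

```python
import os

def resolve_secret(value: str) -> str:
    """Resolve a litellm-style 'os.environ/VAR' reference to its env value."""
    prefix = "os.environ/"
    if value.startswith(prefix):
        return os.environ.get(value[len(prefix):], "")
    return value  # literal values pass through unchanged

os.environ["OPENAI_API_KEY"] = "sk-demo"
print(resolve_secret("os.environ/OPENAI_API_KEY"))  # -> sk-demo
print(resolve_secret("sk-literal"))                 # -> sk-literal
```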
## Provider Details

Providers are grouped by API protocol. Most providers speak the OpenAI API format — same SDK, different endpoint. litellm handles the routing automatically.
### OpenAI or Compatible

These providers use the OpenAI `/v1/chat/completions` API protocol. litellm routes each to the correct endpoint.
#### OpenAI (Recommended Default)

```bash
OPENAI_API_KEY=sk-proj-...  # https://platform.openai.com/api-keys
```

- Cost: ~$0.15/1M input tokens (GPT-4o-mini), ~$2.50/1M (GPT-4o)
- Good for: Orchestrator, Knowledge, Context, Document Extraction
- Provider: `openai`
#### DeepSeek

```bash
DEEPSEEK_API_KEY=sk-...  # https://platform.deepseek.com
```

- Cost: Very cheap (~$0.14/1M tokens for DeepSeek-V3)
- Good for: Budget-conscious deployments
- Provider: `deepseek`
#### Groq

```bash
GROQ_API_KEY=gsk_...  # https://console.groq.com
```

- Very fast inference via LPU hardware
- Good for: Real-time responses, low latency
- Provider: `groq`
#### Together AI

```bash
TOGETHER_API_KEY=...  # https://api.together.ai
```

- Open-source models at scale — Llama, Mistral, FLUX
- Provider: `together`
#### Fireworks AI

```bash
FIREWORKS_API_KEY=...  # https://fireworks.ai
```

- Fast open-source model inference
- Provider: `fireworks`
#### Perplexity

```bash
PERPLEXITY_API_KEY=pplx-...  # https://perplexity.ai/settings/api
```

- Sonar models — search-augmented generation
- Provider: `perplexity`
#### GLM (Zhipu AI)

```bash
GLM_API_KEY=your-glm-key  # https://open.bigmodel.cn
```

Configure in the LLM Routing Editor:

- Provider: `glm`
- Model: `glm-4-flash` (or `glm-4-plus`, `glm-4-long`)
- API Base URL: `https://open.bigmodel.cn/api/paas/v4`
- API Key Env Var: `GLM_API_KEY`

GLM uses an OpenAI-compatible API — the `glm` provider setting automatically routes through the `openai/` litellm prefix with the correct `api_base`.
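That routing can be pictured as a small mapping from the UI's provider setting to the litellm model string and base URL (a sketch of the convention described above, not Binexia's actual router code):

```python
def to_litellm_model(provider: str, model: str):
    """Map a provider setting to a (litellm model string, api_base) pair."""
    if provider == "glm":
        # GLM is OpenAI-compatible: route via the openai/ prefix,
        # pointing api_base at Zhipu's endpoint.
        return f"openai/{model}", "https://open.bigmodel.cn/api/paas/v4"
    # Native litellm providers use their own prefix and default endpoint.
    return f"{provider}/{model}", None

print(to_litellm_model("glm", "glm-4-flash"))
# -> ('openai/glm-4-flash', 'https://open.bigmodel.cn/api/paas/v4')
print(to_litellm_model("anthropic", "claude-sonnet-4-6"))
# -> ('anthropic/claude-sonnet-4-6', None)
```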
#### Azure OpenAI

```bash
AZURE_API_KEY=...
AZURE_API_BASE=https://your-resource.openai.azure.com
AZURE_API_VERSION=2024-06-01
```

- Enterprise compliance — regional deployment, data residency
- Provider: `azure`
- Requires `api_base` set to your Azure endpoint
#### Custom (Any OpenAI-Compatible Endpoint)

For self-hosted models (Ollama, vLLM, LocalAI, etc.):

```bash
CUSTOM_LLM_API_KEY=optional-key
```

- Provider: `custom`
- Set `api_base` to your endpoint (e.g., `http://localhost:11434/v1` for Ollama)
### Anthropic

```bash
ANTHROPIC_API_KEY=sk-ant-api03-...  # https://console.anthropic.com/settings/keys
```

- Cost: ~$3/1M input tokens (Claude Sonnet)
- Good for: Analytics, Behavioral Scoring, Anomaly Detection, Forecast
- Stronger at: Complex SQL generation, reasoning, analysis
- Provider: `anthropic`
### Google Gemini

```bash
GOOGLE_API_KEY=AI...  # https://aistudio.google.com/apikey
```

- Multimodal — text, images, video
- Provider: `google`
- Models: `gemini-2.0-flash`, `gemini-2.5-pro`
- Free tier: 15 RPM, 1M TPM — great for development (see Cheap LLM for Dev)
### Mistral

```bash
MISTRAL_API_KEY=...  # https://console.mistral.ai
```

- European provider — data stays in the EU
- Provider: `mistral`
### OpenRouter (Gateway)

```bash
OPENROUTER_API_KEY=sk-or-...  # https://openrouter.ai/keys
```

- Cost: Varies by model (often cheaper than direct)
- Good for: Accessing 100+ models with one key
- Model format: `anthropic/claude-3.5-sonnet`, `meta-llama/llama-3-70b`, etc.
- Provider: `openrouter`
### Cohere

```bash
COHERE_API_KEY=...  # https://dashboard.cohere.com
```

- Enterprise NLP — search, summarization, classification
- Provider: `cohere`
## Mock Mode (No Keys)
Without any LLM keys, Binexia runs in mock mode:
- Dashboard widgets show hardcoded demo data
- AI queries return canned demo responses
- Document extraction skips LLM synthesis
- Scheduled agents don't run
Everything else works: login, CRUD, file upload, dashboard layout, semantic model editing.
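The switch between live and mock behavior hinges on whether any provider key is present. A sketch of how such a check can be made (hypothetical helper and key list; Binexia's actual detection logic may differ):

```python
import os

# Illustrative subset of the provider keys documented above.
PROVIDER_KEYS = [
    "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY",
    "DEEPSEEK_API_KEY", "GROQ_API_KEY", "MISTRAL_API_KEY",
]

def llm_mode(env=os.environ) -> str:
    """Return 'live' if any provider key is set, else 'mock'."""
    return "live" if any(env.get(k) for k in PROVIDER_KEYS) else "mock"

print(llm_mode({}))                          # -> mock
print(llm_mode({"OPENAI_API_KEY": "sk-x"}))  # -> live
```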
## Cost Estimates
Approximate cost per agent type with default models:
| Agent | Model | Avg tokens/query | Est. cost/1000 queries |
|---|---|---|---|
| Orchestrator | GPT-4o-mini | ~500 | $0.08 |
| Analytics | Claude Sonnet | ~2000 | $6.00 |
| Knowledge | GPT-4o-mini | ~1500 | $0.23 |
| Context | GPT-4o-mini | ~800 | $0.12 |
| Behavioral | Claude Sonnet | ~3000 | $9.00 |
| Anomaly | Claude Sonnet | ~2000 | $6.00 |
| Forecast | Claude Sonnet | ~2500 | $7.50 |
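The table's figures follow from flat per-token arithmetic. A sketch reproducing two rows (prices taken from the provider sections above; all figures are approximate and input-token only):

```python
def cost_per_1000_queries(tokens_per_query: int, usd_per_million: float) -> float:
    """Estimated USD for 1000 queries at a flat price per million tokens."""
    return 1000 * tokens_per_query * usd_per_million / 1_000_000

# Analytics row: Claude Sonnet at ~$3/1M input tokens, ~2000 tokens/query
print(cost_per_1000_queries(2000, 3.0))  # -> 6.0
# Orchestrator row: GPT-4o-mini at ~$0.15/1M, ~500 tokens/query (~$0.08)
print(cost_per_1000_queries(500, 0.15))
```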