Core Concepts
LLM Routing
How Binexia routes each agent to the best LLM provider and model using litellm.
Every LLM call in Binexia goes through litellm — a unified API client that supports 100+ providers with a single interface. The routing table in `ubios_config.llm_routing` maps each agent to a specific provider, model, and configuration.
Two LLM Access Patterns
Binexia uses litellm in two ways:
| Pattern | Used by | How | Why |
|---|---|---|---|
| Python library (in-process) | Agno agents | litellm.completion() | Hot path — lowest latency, many calls per session |
| Proxy server (HTTP) | Dify, other services | POST http://ubios_litellm:4000/v1/chat/completions | OpenAI-compatible endpoint, unified key management |
Both use the same API keys from .env. Agno reads them directly; the proxy reads them from its config.yaml.
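For the proxy pattern, any OpenAI-compatible client works. Below is a minimal stdlib-only sketch of building such a request; `ubios_litellm` is the proxy host from the table above, while the model name and API key are placeholders:

```python
import json
import urllib.request

# Proxy-pattern request to litellm's OpenAI-compatible endpoint.
# "ubios_litellm" is the service hostname from the table above;
# the model name and API key below are placeholders.
PROXY_URL = "http://ubios_litellm:4000/v1/chat/completions"

def build_proxy_request(model, messages, api_key):
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        PROXY_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_proxy_request(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    api_key="sk-placeholder",
)
# urllib.request.urlopen(req) would send it -- requires the proxy to be running.
```

Because the endpoint is OpenAI-compatible, an OpenAI SDK client pointed at the proxy's base URL works equally well.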
How It Works (Agno Agents)
Each agent's route is a row in the config table:

```sql
SELECT * FROM ubios_config.llm_routing WHERE agent_name = 'AnalyticsAgent';
-- Result:
-- agent_name: AnalyticsAgent
-- provider: anthropic
-- model: claude-sonnet-4-6
-- max_tokens: 4096
-- temperature: 0.1
-- api_key_env: ANTHROPIC_API_KEY
-- api_base: NULL
-- fallback_route_id: NULL
-- is_active: true
```
When the AnalyticsAgent needs to generate SQL, the router:
- Looks up the route for `AnalyticsAgent`
- Reads the API key from the environment variable specified in `api_key_env`
- Calls litellm with the provider prefix + model name
- If the call fails and `fallback_route_id` is set, tries the fallback route
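The steps above can be sketched in Python. The route dict and helper names here are illustrative, not Binexia's actual router code; only the `litellm.completion()` keyword arguments reflect the real API:

```python
import os

# Hypothetical in-memory route mirroring the ubios_config.llm_routing columns.
ROUTE = {
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "max_tokens": 4096,
    "temperature": 0.1,
    "api_key_env": "ANTHROPIC_API_KEY",
    "api_base": None,
    "fallback": None,  # would hold another route dict when fallback_route_id is set
}

def build_completion_kwargs(route):
    """Translate a routing row into litellm.completion() keyword arguments."""
    return {
        # litellm model strings are "<provider prefix>/<model name>".
        "model": f"{route['provider']}/{route['model']}",
        "max_tokens": route["max_tokens"],
        "temperature": route["temperature"],
        # The key is read from whichever env var the route names.
        "api_key": os.environ[route["api_key_env"]],
        "api_base": route["api_base"],
    }

def call_with_fallback(route, messages):
    import litellm  # in-process "hot path" usage, per the table above

    try:
        return litellm.completion(messages=messages, **build_completion_kwargs(route))
    except Exception:
        if route["fallback"] is None:
            raise
        # Primary call failed and a fallback route exists: retry once with it.
        return litellm.completion(
            messages=messages, **build_completion_kwargs(route["fallback"])
        )
```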
Provider Compatibility Tiers
Providers are grouped by API protocol rather than by brand. Most providers speak the OpenAI protocol — same API format, different endpoint.
OpenAI or Compatible
Providers that implement the OpenAI /v1/chat/completions API. litellm handles routing automatically.
| Provider | provider value | Env var | litellm prefix | Notes |
|---|---|---|---|---|
| OpenAI | openai | OPENAI_API_KEY | openai/ | Direct |
| DeepSeek | deepseek | DEEPSEEK_API_KEY | deepseek/ | Native litellm handler |
| Groq | groq | GROQ_API_KEY | groq/ | Native litellm handler |
| Together AI | together | TOGETHER_API_KEY | together/ | Native litellm handler |
| Fireworks AI | fireworks | FIREWORKS_API_KEY | fireworks/ | Native litellm handler |
| Perplexity | perplexity | PERPLEXITY_API_KEY | perplexity/ | Native litellm handler |
| Azure OpenAI | azure | AZURE_API_KEY | azure/ | Native litellm handler, set api_base |
| GLM (Zhipu AI) | glm | GLM_API_KEY | openai/ | Set api_base: https://open.bigmodel.cn/api/paas/v4 |
| Any /v1 endpoint | custom | CUSTOM_LLM_API_KEY | openai/ | Ollama, vLLM, LocalAI, etc. |
Anthropic
| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Anthropic | anthropic | ANTHROPIC_API_KEY | anthropic/ |
Google Gemini
| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Google Gemini | google | GOOGLE_API_KEY | gemini/ |
Mistral
| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Mistral | mistral | MISTRAL_API_KEY | mistral/ |
OpenRouter (Gateway)
| Provider | provider value | Env var | litellm prefix | Notes |
|---|---|---|---|---|
| OpenRouter | openrouter | OPENROUTER_API_KEY | openrouter/ | Access 100+ models with one key |
Cohere
| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Cohere | cohere | COHERE_API_KEY | cohere/ |
Custom Endpoints
The `api_base` field lets you point any provider at a custom endpoint:
```sql
-- GLM via OpenAI-compatible endpoint
UPDATE ubios_config.llm_routing
SET provider = 'glm',
    api_base = 'https://open.bigmodel.cn/api/paas/v4',
    api_key_env = 'GLM_API_KEY',
    model = 'glm-4-flash'
WHERE agent_name = 'KnowledgeAgent';
```
```sql
-- Self-hosted vLLM
UPDATE ubios_config.llm_routing
SET provider = 'custom',
    api_base = 'http://localhost:8000/v1',
    api_key_env = 'CUSTOM_LLM_API_KEY',
    model = 'meta-llama/Llama-3-70B'
WHERE agent_name = 'OrchestratorAgent';
```
Multiple Keys Per Provider
You can define multiple env vars for the same provider and assign them to different agents:
```bash
# .env.testip.local
GLM_API_KEY_1=key-for-cheap-models
GLM_API_KEY_2=key-for-premium-models
```

```sql
-- Agent 1 uses key 1
UPDATE ubios_config.llm_routing
SET api_key_env = 'GLM_API_KEY_1', model = 'glm-4.5-air'
WHERE agent_name = 'OrchestratorAgent';

-- Agent 2 uses key 2
UPDATE ubios_config.llm_routing
SET api_key_env = 'GLM_API_KEY_2', model = 'glm-5'
WHERE agent_name = 'AnalyticsAgent';
```
Fallback Routes
Set `fallback_route_id` to an alternate route name. If the primary call fails, the router tries the fallback:

```sql
INSERT INTO ubios_config.llm_routing (agent_name, provider, model, max_tokens, temperature, api_key_env, is_active)
VALUES ('analytics-backup', 'openai', 'gpt-4o-mini', 4096, 0.1, 'OPENAI_API_KEY', true);

UPDATE ubios_config.llm_routing
SET fallback_route_id = 'analytics-backup'
WHERE agent_name = 'AnalyticsAgent';
```
The LLM Routing Editor (Admin UI)
Settings → LLM Routing provides a visual editor where you can:
- Change provider and model for each agent
- Set custom API base URLs
- Choose which env var holds the API key
- Configure fallback routes
- Toggle routes active/inactive
Changes take effect immediately — no restart needed.
Default Routes (IntelliTravel Demo)
| Agent | Provider | Model | Why |
|---|---|---|---|
| OrchestratorAgent | openai | gpt-4o-mini | Fast routing, cheap |
| AnalyticsAgent | anthropic | claude-sonnet-4-6 | Strong SQL generation |
| KnowledgeAgent | openai | gpt-4o-mini | Fast RAG responses |
| ContextAgent | openai | gpt-4o-mini | Fast context explanations |
| BehavioralScoringAgent | anthropic | claude-sonnet-4-6 | Complex reasoning |
| AnomalyDetectionAgent | anthropic | claude-sonnet-4-6 | Statistical analysis |
| ForecastAgent | anthropic | claude-sonnet-4-6 | Time-series reasoning |
| DocumentExtractionAgent | openai | gpt-4o-mini | Fast field extraction |