Core Concepts
LLM Routing
How Binexia routes each agent to the best LLM provider and model using litellm.
Every LLM call in Binexia goes through litellm — a unified API client that supports 100+ providers with a single interface. The routing table in `ubios_config.llm_routing` maps each agent to a specific provider, model, and configuration.
Two LLM Access Patterns
Binexia uses litellm in two ways:
| Pattern | Used by | How | Why |
|---|---|---|---|
| Python library (in-process) | Agno agents | litellm.completion() | Hot path — lowest latency, many calls per session |
| Proxy server (HTTP) | Dify, other services | POST http://ubios_litellm:4000/v1/chat/completions | OpenAI-compatible endpoint, unified key management |
Both use the same API keys from .env. Agno reads them directly; the proxy reads them from its config.yaml.
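For the proxy pattern, any OpenAI-compatible client works. Below is a minimal stdlib-only sketch of building such a request; `ubios_litellm` is the proxy host from the table above, while the model name and API key are placeholders:

```python
import json
import urllib.request

# Proxy-pattern request to litellm's OpenAI-compatible endpoint.
# "ubios_litellm" is the service hostname from the table above;
# the model name and API key below are placeholders.
PROXY_URL = "http://ubios_litellm:4000/v1/chat/completions"

def build_proxy_request(model, messages, api_key):
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        PROXY_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_proxy_request(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    api_key="sk-placeholder",
)
# urllib.request.urlopen(req) would send it -- requires the proxy to be running.
```

Because the endpoint is OpenAI-compatible, an OpenAI SDK client pointed at the proxy's base URL works equally well.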
How It Works (Agno Agents)
Each agent's route is a row in the config table:

```sql
SELECT * FROM ubios_config.llm_routing WHERE agent_name = 'AnalyticsAgent';
-- Result:
-- agent_name: AnalyticsAgent
-- provider: anthropic
-- model: claude-sonnet-4-6
-- max_tokens: 4096
-- temperature: 0.1
-- api_key_env: ANTHROPIC_API_KEY
-- api_base: NULL
-- fallback_route_id: NULL
-- is_active: true
```
When the AnalyticsAgent needs to generate SQL, the router:
- Looks up the route for `AnalyticsAgent`
- Reads the API key from the environment variable specified in `api_key_env`
- Calls litellm with the provider prefix + model name
- If the call fails and `fallback_route_id` is set, tries the fallback route
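The steps above can be sketched in Python. The route dict and helper names here are illustrative, not Binexia's actual router code; only the `litellm.completion()` keyword arguments reflect the real API:

```python
import os

# Hypothetical in-memory route mirroring the ubios_config.llm_routing columns.
ROUTE = {
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "max_tokens": 4096,
    "temperature": 0.1,
    "api_key_env": "ANTHROPIC_API_KEY",
    "api_base": None,
    "fallback": None,  # would hold another route dict when fallback_route_id is set
}

def build_completion_kwargs(route):
    """Translate a routing row into litellm.completion() keyword arguments."""
    return {
        # litellm model strings are "<provider prefix>/<model name>".
        "model": f"{route['provider']}/{route['model']}",
        "max_tokens": route["max_tokens"],
        "temperature": route["temperature"],
        # The key is read from whichever env var the route names.
        "api_key": os.environ[route["api_key_env"]],
        "api_base": route["api_base"],
    }

def call_with_fallback(route, messages):
    import litellm  # in-process "hot path" usage, per the table above

    try:
        return litellm.completion(messages=messages, **build_completion_kwargs(route))
    except Exception:
        if route["fallback"] is None:
            raise
        # Primary call failed and a fallback route exists: retry once with it.
        return litellm.completion(
            messages=messages, **build_completion_kwargs(route["fallback"])
        )
```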
Provider Compatibility Tiers
Providers are grouped by API protocol rather than by brand. Most providers speak the OpenAI protocol — same API format, different endpoint.
OpenAI or Compatible
Providers that implement the OpenAI /v1/chat/completions API. litellm handles routing automatically.
| Provider | provider value | Env var | litellm prefix | Notes |
|---|---|---|---|---|
| OpenAI | openai | OPENAI_API_KEY | openai/ | Direct |
| DeepSeek | deepseek | DEEPSEEK_API_KEY | deepseek/ | Native litellm handler |
| Groq | groq | GROQ_API_KEY | groq/ | Native litellm handler |
| Together AI | together | TOGETHER_API_KEY | together/ | Native litellm handler |
| Fireworks AI | fireworks | FIREWORKS_API_KEY | fireworks/ | Native litellm handler |
| Perplexity | perplexity | PERPLEXITY_API_KEY | perplexity/ | Native litellm handler |
| Azure OpenAI | azure | AZURE_API_KEY | azure/ | Native litellm handler, set api_base |
| GLM (Zhipu AI) | glm | GLM_API_KEY | openai/ | Set api_base: https://open.bigmodel.cn/api/paas/v4 |
| Any /v1 endpoint | custom | CUSTOM_LLM_API_KEY | openai/ | Ollama, vLLM, LocalAI, etc. |
Anthropic
| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Anthropic | anthropic | ANTHROPIC_API_KEY | anthropic/ |
Google Gemini
| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Google Gemini | google | GOOGLE_API_KEY | gemini/ |
Mistral
| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Mistral | mistral | MISTRAL_API_KEY | mistral/ |
OpenRouter (Gateway)
| Provider | provider value | Env var | litellm prefix | Notes |
|---|---|---|---|---|
| OpenRouter | openrouter | OPENROUTER_API_KEY | openrouter/ | Access 100+ models with one key |
Cohere
| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Cohere | cohere | COHERE_API_KEY | cohere/ |
Custom Endpoints
The `api_base` field lets you point any provider at a custom endpoint:
```sql
-- GLM via OpenAI-compatible endpoint
UPDATE ubios_config.llm_routing
SET provider = 'glm',
    api_base = 'https://open.bigmodel.cn/api/paas/v4',
    api_key_env = 'GLM_API_KEY',
    model = 'glm-4-flash'
WHERE agent_name = 'KnowledgeAgent';
```
```sql
-- Self-hosted vLLM
UPDATE ubios_config.llm_routing
SET provider = 'custom',
    api_base = 'http://localhost:8000/v1',
    api_key_env = 'CUSTOM_LLM_API_KEY',
    model = 'meta-llama/Llama-3-70B'
WHERE agent_name = 'OrchestratorAgent';
```
Multiple Keys Per Provider
You can define multiple env vars for the same provider and assign them to different agents:
```bash
# .env.testip.local
GLM_API_KEY_1=key-for-cheap-models
GLM_API_KEY_2=key-for-premium-models
```

```sql
-- Agent 1 uses key 1
UPDATE ubios_config.llm_routing
SET api_key_env = 'GLM_API_KEY_1', model = 'glm-4.5-air'
WHERE agent_name = 'OrchestratorAgent';

-- Agent 2 uses key 2
UPDATE ubios_config.llm_routing
SET api_key_env = 'GLM_API_KEY_2', model = 'glm-5'
WHERE agent_name = 'AnalyticsAgent';
```
Fallback Routes
Set `fallback_route_id` to an alternate route name. If the primary call fails, the router tries the fallback:

```sql
INSERT INTO ubios_config.llm_routing (agent_name, provider, model, max_tokens, temperature, api_key_env, is_active)
VALUES ('analytics-backup', 'openai', 'gpt-4o-mini', 4096, 0.1, 'OPENAI_API_KEY', true);

UPDATE ubios_config.llm_routing
SET fallback_route_id = 'analytics-backup'
WHERE agent_name = 'AnalyticsAgent';
```
The LLM Routing Editor (Admin UI)
Settings → LLM Routing provides a visual editor where you can:
- Change provider and model for each agent
- Set custom API base URLs
- Choose which env var holds the API key
- Configure fallback routes
- Toggle routes active/inactive
Changes take effect immediately — no restart needed.
Default Routes (IntelliTravel Demo)
| Agent | Provider | Model | Why |
|---|---|---|---|
| OrchestratorAgent | openai | gpt-4o-mini | Fast routing, cheap |
| AnalyticsAgent | anthropic | claude-sonnet-4-6 | Strong SQL generation |
| KnowledgeAgent | openai | gpt-4o-mini | Fast RAG responses |
| ContextAgent | openai | gpt-4o-mini | Fast context explanations |
| BehavioralScoringAgent | anthropic | claude-sonnet-4-6 | Complex reasoning |
| AnomalyDetectionAgent | anthropic | claude-sonnet-4-6 | Statistical analysis |
| ForecastAgent | anthropic | claude-sonnet-4-6 | Time-series reasoning |
| DocumentExtractionAgent | openai | gpt-4o-mini | Fast field extraction |