Core Concepts

LLM Routing

How Binexia routes each agent to the best LLM provider and model using litellm.

Every LLM call in Binexia goes through litellm, a unified API client that supports 100+ providers with a single interface. The routing table in ubios_config.llm_routing maps each agent to a specific provider, model, and configuration.

Two LLM Access Patterns

Binexia uses litellm in two ways:

| Pattern | Used by | How | Why |
|---|---|---|---|
| Python library (in-process) | Agno agents | litellm.completion() | Hot path: lowest latency, many calls per session |
| Proxy server (HTTP) | Dify, other services | POST http://ubios_litellm:4000/v1/chat/completions | OpenAI-compatible endpoint, unified key management |

Both use the same API keys from .env. Agno reads them directly; the proxy reads them from its config.yaml.
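Both patterns can be sketched in a few lines of Python. This is illustrative only: the model name and proxy URL come from the table above, and the in-process call is commented out so the sketch runs without network access or API keys.

```python
import json

# Pattern 1: Python library (in-process), as Agno agents use it.
# Commented out so this sketch runs offline without API keys:
# import litellm
# resp = litellm.completion(
#     model="openai/gpt-4o-mini",
#     messages=[{"role": "user", "content": "ping"}],
# )

# Pattern 2: proxy server (HTTP). Any OpenAI-compatible client works;
# here we only build the JSON body that would be POSTed.
proxy_url = "http://ubios_litellm:4000/v1/chat/completions"
payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "ping"}],
}
body = json.dumps(payload)
```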

How It Works (Agno Agents)

Each agent's route is one row in ubios_config.llm_routing. For example:

```sql
SELECT * FROM ubios_config.llm_routing WHERE agent_name = 'AnalyticsAgent';

-- Result:
-- agent_name: AnalyticsAgent
-- provider: anthropic
-- model: claude-sonnet-4-6
-- max_tokens: 4096
-- temperature: 0.1
-- api_key_env: ANTHROPIC_API_KEY
-- api_base: NULL
-- fallback_route_id: NULL
-- is_active: true
```

When the AnalyticsAgent needs to generate SQL, the router:

  1. Looks up the route for AnalyticsAgent
  2. Reads the API key from the environment variable specified in api_key_env
  3. Calls litellm with the provider prefix + model name
  4. If the call fails and fallback_route_id is set, tries the fallback route
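The first three steps above can be sketched as follows. The route dict mirrors the llm_routing row shown earlier; resolve_route() is a hypothetical helper, not Binexia's actual router, and fallback handling is omitted here.

```python
import os

# Mirrors the AnalyticsAgent row from ubios_config.llm_routing.
route = {
    "agent_name": "AnalyticsAgent",
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "max_tokens": 4096,
    "temperature": 0.1,
    "api_key_env": "ANTHROPIC_API_KEY",
    "api_base": None,
}

def resolve_route(route):
    """Build litellm.completion() kwargs from a routing row.

    Note: this sketch assumes the provider value doubles as the litellm
    prefix; some providers (e.g. glm) map to the openai/ prefix instead,
    as the compatibility tables below show.
    """
    kwargs = {
        "model": f"{route['provider']}/{route['model']}",  # prefix + model
        "max_tokens": route["max_tokens"],
        "temperature": route["temperature"],
        "api_key": os.environ.get(route["api_key_env"]),  # step 2
    }
    if route["api_base"]:
        kwargs["api_base"] = route["api_base"]
    return kwargs

kwargs = resolve_route(route)
```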

Provider Compatibility Tiers

Providers are grouped by API protocol rather than by individual brand. Most providers speak the OpenAI protocol: the same API format at a different endpoint.

OpenAI or Compatible

Providers that implement the OpenAI /v1/chat/completions API. litellm handles routing automatically.

| Provider | provider value | Env var | litellm prefix | Notes |
|---|---|---|---|---|
| OpenAI | openai | OPENAI_API_KEY | openai/ | Direct |
| DeepSeek | deepseek | DEEPSEEK_API_KEY | deepseek/ | Native litellm handler |
| Groq | groq | GROQ_API_KEY | groq/ | Native litellm handler |
| Together AI | together | TOGETHER_API_KEY | together/ | Native litellm handler |
| Fireworks AI | fireworks | FIREWORKS_API_KEY | fireworks/ | Native litellm handler |
| Perplexity | perplexity | PERPLEXITY_API_KEY | perplexity/ | Native litellm handler |
| Azure OpenAI | azure | AZURE_API_KEY | azure/ | Native litellm handler; set api_base |
| GLM (Zhipu AI) | glm | GLM_API_KEY | openai/ | Set api_base: https://open.bigmodel.cn/api/paas/v4 |
| Any /v1 endpoint | custom | CUSTOM_LLM_API_KEY | openai/ | Ollama, vLLM, LocalAI, etc. |
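The practical difference between these rows: providers with a native litellm handler keep their own prefix, while everything else is routed through the generic openai/ prefix plus an api_base. A minimal sketch of that mapping, where build_model_ref() is a hypothetical helper and the prefix values follow the table above:

```python
def build_model_ref(provider, model, api_base=None):
    """Return (litellm model string, extra kwargs) for a routing row."""
    # Providers with a native litellm handler keep their own prefix.
    native = {"openai", "deepseek", "groq", "together",
              "fireworks", "perplexity", "azure"}
    if provider in native:
        return f"{provider}/{model}", {}
    # Everything else speaks the OpenAI protocol at a custom endpoint.
    return f"openai/{model}", {"api_base": api_base}

native_ref = build_model_ref("deepseek", "deepseek-chat")
glm_ref = build_model_ref("glm", "glm-4-flash",
                          "https://open.bigmodel.cn/api/paas/v4")
```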

Anthropic

| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Anthropic | anthropic | ANTHROPIC_API_KEY | anthropic/ |

Google Gemini

| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Google Gemini | google | GOOGLE_API_KEY | gemini/ |

Mistral

| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Mistral | mistral | MISTRAL_API_KEY | mistral/ |

OpenRouter (Gateway)

| Provider | provider value | Env var | litellm prefix | Notes |
|---|---|---|---|---|
| OpenRouter | openrouter | OPENROUTER_API_KEY | openrouter/ | Access 100+ models with one key |

Cohere

| Provider | provider value | Env var | litellm prefix |
|---|---|---|---|
| Cohere | cohere | COHERE_API_KEY | cohere/ |

Custom Endpoints

The api_base field lets you point any provider at a custom endpoint:

```sql
-- GLM via OpenAI-compatible endpoint
UPDATE ubios_config.llm_routing
SET provider = 'glm',
    api_base = 'https://open.bigmodel.cn/api/paas/v4',
    api_key_env = 'GLM_API_KEY',
    model = 'glm-4-flash'
WHERE agent_name = 'KnowledgeAgent';

-- Self-hosted vLLM
UPDATE ubios_config.llm_routing
SET provider = 'custom',
    api_base = 'http://localhost:8000/v1',
    api_key_env = 'CUSTOM_LLM_API_KEY',
    model = 'meta-llama/Llama-3-70B'
WHERE agent_name = 'OrchestratorAgent';
```
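For reference, the self-hosted vLLM route above would translate into roughly these litellm.completion() arguments. This is a sketch built from the UPDATE statement's values; the call itself is commented out so it runs without a local vLLM server.

```python
import os

# 'custom' provider: openai/ prefix, endpoint taken from api_base.
vllm_kwargs = {
    "model": "openai/meta-llama/Llama-3-70B",
    "api_base": "http://localhost:8000/v1",
    "api_key": os.environ.get("CUSTOM_LLM_API_KEY", "unused-for-local"),
    "messages": [{"role": "user", "content": "hello"}],
}
# import litellm
# resp = litellm.completion(**vllm_kwargs)
```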

Multiple Keys Per Provider

You can define multiple env vars for the same provider and assign them to different agents:

```bash
# .env.testip.local
GLM_API_KEY_1=key-for-cheap-models
GLM_API_KEY_2=key-for-premium-models
```

```sql
-- Agent 1 uses key 1
UPDATE ubios_config.llm_routing
SET api_key_env = 'GLM_API_KEY_1', model = 'glm-4.5-air'
WHERE agent_name = 'OrchestratorAgent';

-- Agent 2 uses key 2
UPDATE ubios_config.llm_routing
SET api_key_env = 'GLM_API_KEY_2', model = 'glm-5'
WHERE agent_name = 'AnalyticsAgent';
```
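At call time, the two agents simply resolve different environment variables. A minimal sketch, with fake key values and a hypothetical key_for() helper:

```python
import os

# Fake values standing in for the .env entries above.
os.environ["GLM_API_KEY_1"] = "key-for-cheap-models"
os.environ["GLM_API_KEY_2"] = "key-for-premium-models"

# Mirrors the api_key_env column after the two UPDATEs.
routes = {
    "OrchestratorAgent": {"api_key_env": "GLM_API_KEY_1", "model": "glm-4.5-air"},
    "AnalyticsAgent": {"api_key_env": "GLM_API_KEY_2", "model": "glm-5"},
}

def key_for(agent_name):
    """Resolve the API key for an agent from its routing row."""
    return os.environ[routes[agent_name]["api_key_env"]]
```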

Fallback Routes

Set fallback_route_id to an alternate route name. If the primary call fails, the router tries the fallback:

```sql
INSERT INTO ubios_config.llm_routing (agent_name, provider, model, max_tokens, temperature, api_key_env, is_active)
VALUES ('analytics-backup', 'openai', 'gpt-4o-mini', 4096, 0.1, 'OPENAI_API_KEY', true);

UPDATE ubios_config.llm_routing
SET fallback_route_id = 'analytics-backup'
WHERE agent_name = 'AnalyticsAgent';
```
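The resulting behavior can be sketched as a try/except around the primary call. Here call_llm() is a stand-in that simulates a provider outage; it is not Binexia's actual implementation.

```python
# Mirrors the two routes configured above.
routes = {
    "AnalyticsAgent": {"model": "anthropic/claude-sonnet-4-6",
                       "fallback_route_id": "analytics-backup"},
    "analytics-backup": {"model": "openai/gpt-4o-mini",
                         "fallback_route_id": None},
}

def call_llm(route):
    # Stand-in: pretend the primary provider is down.
    if route["model"].startswith("anthropic/"):
        raise RuntimeError("provider unavailable")
    return f"ok via {route['model']}"

def complete_with_fallback(agent_name):
    """Try the primary route; on failure, retry the fallback route."""
    route = routes[agent_name]
    try:
        return call_llm(route)
    except Exception:
        fallback = route["fallback_route_id"]
        if fallback is None:
            raise
        return call_llm(routes[fallback])

result = complete_with_fallback("AnalyticsAgent")
```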

The LLM Routing Editor (Admin UI)

Settings → LLM Routing provides a visual editor where you can:

  • Change provider and model for each agent
  • Set custom API base URLs
  • Choose which env var holds the API key
  • Configure fallback routes
  • Toggle routes active/inactive

Changes take effect immediately — no restart needed.

Default Routes (IntelliTravel Demo)

| Agent | Provider | Model | Why |
|---|---|---|---|
| OrchestratorAgent | openai | gpt-4o-mini | Fast routing, cheap |
| AnalyticsAgent | anthropic | claude-sonnet-4-6 | Strong SQL generation |
| KnowledgeAgent | openai | gpt-4o-mini | Fast RAG responses |
| ContextAgent | openai | gpt-4o-mini | Fast context explanations |
| BehavioralScoringAgent | anthropic | claude-sonnet-4-6 | Complex reasoning |
| AnomalyDetectionAgent | anthropic | claude-sonnet-4-6 | Statistical analysis |
| ForecastAgent | anthropic | claude-sonnet-4-6 | Time-series reasoning |
| DocumentExtractionAgent | openai | gpt-4o-mini | Fast field extraction |