Architecture

Request Flow

Three real requests traced end-to-end through the Binexia stack, covering the most common scenarios.

1. User asks a natural language question

"What's our revenue this month compared to last month?"

Browser
  └─ POST /api/v1/agent/query { question: "..." }
      └─► Laravel API
            ├─ Auth check (Sanctum)
            ├─ Load semantic schema_context from tenant_semantic
            └─► POST http://ubios_agno:8001/query
                  └─► Agno OrchestratorAgent
                        ├─ Analyzes: structured data question
                        └─► Routes to AnalyticsAgent
                              ├─ Reads schema_context
                              ├─ Generates SQL via litellm
                              ├─ Executes SQL via ubios_reader
                              └─ Returns { sql, data, chart_config }
            ◄── Agno response
            ├─ Cache result in Redis (5 min TTL)
            └─► JSON response to browser
                  └─► Recharts renders chart

Key points:

  • The semantic model (schema_context) is the LLM prompt — not raw schema DDL
  • SQL executes via ubios_reader (read-only, with statement timeout)
  • Cached results skip the LLM entirely on repeat queries
  • Agno uses litellm Python library (in-process) — no HTTP hop to the proxy
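The cache-aside behavior described above can be sketched as follows. This is a minimal illustration, not the production code: the key format, the `InMemoryCache` stand-in for Redis, and the `call_agent` callback are all assumptions made for the example; only the get/setex semantics and the 5-minute TTL mirror the flow above.

```python
import hashlib
import json
import time


class InMemoryCache:
    """Stand-in for Redis with get/setex semantics (illustrative only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily evict expired entries
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def query_cache_key(tenant_id: str, question: str) -> str:
    # Key on tenant + normalized question so identical repeat queries hit.
    # (Hypothetical key format, not necessarily what the API uses.)
    digest = hashlib.sha256(question.strip().lower().encode()).hexdigest()
    return f"agent:query:{tenant_id}:{digest}"


def answer_question(cache, tenant_id, question, call_agent, ttl=300):
    """Cache-aside: repeat queries skip the LLM entirely (5-minute TTL)."""
    key = query_cache_key(tenant_id, question)
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)   # cache hit: no call to the Agno service
    result = call_agent(question)   # miss: forward to POST /query
    cache.setex(key, ttl, json.dumps(result))
    return result
```

The point of the sketch is the ordering: the cache is consulted before the Agno service, so the LLM (and the SQL execution behind it) is bypassed on repeats within the TTL window.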
2. User clicks a KPI card

Browser — click on "Revenue" KPI card
  └─ POST /api/v1/agent/context { entity: "booking", metric: "revenue", filters: [...] }
      └─► Laravel API
            └─► POST http://ubios_agno:8001/context
                  └─► ContextAgent
                        ├─ Fetches entity data from Redis (pre-cached by nightly agents)
                        ├─ Generates natural language explanation (cached 5 min)
                        └─ Returns { entity_data, explanation, related_metrics }
            ◄── Agno response
            └─► JSON response to browser
                  └─► Context panel renders overlay

Key points:

  • Entity data is served from Redis cache (pre-populated by scheduled agents)
  • The click triggers a fresh API call — no browser-side caching
  • LLM explanations are cached server-side with 5-minute TTL
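A sketch of the context read path, under stated assumptions: the key formats, the `DictCache` stub (TTL ignored for brevity), and the `generate_explanation` callback are invented for the example. What it mirrors from the flow above is the two-tier behavior: entity data is only ever read from the pre-populated cache, while the LLM explanation is generated on demand and cached server-side for 5 minutes.

```python
import json


class DictCache:
    """Minimal Redis stand-in (TTL ignored for brevity of the example)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def setex(self, key, ttl_seconds, value):
        self._store[key] = value


def fetch_context(cache, tenant_id, entity, metric, generate_explanation):
    """Serve a context-panel payload: entity data from the pre-populated
    cache, with the LLM explanation cached separately."""
    data_key = f"context:data:{tenant_id}:{entity}:{metric}"
    raw = cache.get(data_key)
    if raw is None:
        # The nightly agents should have pre-cached this; degrade gracefully
        # rather than querying tenant data on the hot path.
        return {"entity_data": None, "explanation": None}
    entity_data = json.loads(raw)

    expl_key = f"context:explanation:{tenant_id}:{entity}:{metric}"
    explanation = cache.get(expl_key)
    if explanation is None:
        explanation = generate_explanation(entity_data)  # LLM call
        cache.setex(expl_key, 300, explanation)          # 5-minute TTL
    return {"entity_data": entity_data, "explanation": explanation}
```

The design choice this illustrates: a click never triggers a tenant-database query, only cache reads plus at most one LLM call.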

3. Scheduled agent runs

Laravel scheduler (runs inside queue container)
  └─ php artisan schedule:run
      └─► Dispatches BehavioralScoringAgent job
            └─► Queue worker picks up job
                  └─► POST http://ubios_agno:8001/score
                        └─► BehavioralScoringAgent
                              ├─ Queries booking patterns from tenant_data
                              ├─ Scores each customer for churn risk
                              └─ Returns { scores: [...] }
                  ◄── Agno response
                  ├─ Writes scores to tenant_agent_state.agent_outputs
                  └─► Creates notifications for high-risk customers
                        ├─ In-app notification (agent_state.notifications)
                        └─ Email via Mailpit (critical alerts only)

Key points:

  • Scheduled agents run in the queue container, not the api container
  • Agent outputs expire after 30 days (expires_at column)
  • Only critical-severity outputs trigger email notifications
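The post-scoring step can be sketched as below. The churn-risk thresholds (0.9 for critical, 0.7 for warning) are purely illustrative assumptions; what the sketch takes from the flow above is the fan-out: every score becomes a stored output with a 30-day `expires_at`, high-risk customers get an in-app notification, and only critical severity also produces an email.

```python
from datetime import datetime, timedelta, timezone


def persist_and_notify(scores, now=None):
    """Turn agent scores into stored outputs and notifications.

    Outputs expire after 30 days; only critical severity sends email.
    Thresholds below are hypothetical, chosen for illustration.
    """
    now = now or datetime.now(timezone.utc)
    outputs, notifications, emails = [], [], []
    for s in scores:
        if s["churn_risk"] >= 0.9:       # assumed cutoff for "critical"
            severity = "critical"
        elif s["churn_risk"] >= 0.7:     # assumed cutoff for "warning"
            severity = "warning"
        else:
            severity = "info"

        outputs.append({
            "customer_id": s["customer_id"],
            "score": s["churn_risk"],
            "severity": severity,
            "expires_at": now + timedelta(days=30),  # 30-day retention
        })
        if severity in ("warning", "critical"):
            # High-risk customers get an in-app notification.
            notifications.append({"customer_id": s["customer_id"],
                                  "channel": "in_app",
                                  "severity": severity})
        if severity == "critical":
            # Only critical alerts go out as email (via Mailpit in dev).
            emails.append({"customer_id": s["customer_id"],
                           "channel": "email",
                           "severity": severity})
    return outputs, notifications, emails
```

Keeping the email path gated on severity means a noisy scoring run can never flood inboxes; the worst case is extra rows in `agent_outputs`, which age out on their own.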