Core Concepts

Semantic Model

The core abstraction that makes natural language queries work — entities, metrics, dimensions, and vocabulary.

The semantic model is the most important concept in Binexia. It sits between the raw database schema and the LLM, translating business concepts into queryable SQL. Every natural language query goes through the semantic model.

Info

The semantic model is the single source of truth. Metabase, widgets, and agent queries all derive from it. When you change a metric definition here, downstream consumers update automatically.

The Four Building Blocks

Entities

Map domain tables to natural language names that the LLM understands.

| Field | Example | Purpose |
| --- | --- | --- |
| name | booking | Machine name used in queries |
| display_name | Booking | Human-readable name |
| table_name | bookings | Actual PostgreSQL table |
| primary_key_column | id | Primary key |
| display_column | id | Default display value |
| active_filter_sql | status NOT IN ('cancelled') | Excludes inactive rows from queries |
| description | "A travel booking linking a customer to a destination" | LLM context |
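
To make the role of these fields concrete, here is a minimal sketch of an entity held in memory and used to build a row count that honours `active_filter_sql`. The `Entity` dataclass and the `count_active_sql` helper are hypothetical names for illustration only; they are not part of Binexia's API, and only the field names and example values come from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """Hypothetical in-memory mirror of one entity definition."""
    name: str
    display_name: str
    table_name: str
    primary_key_column: str
    display_column: str
    active_filter_sql: Optional[str]
    description: str


def count_active_sql(entity: Entity) -> str:
    """Count rows of the entity, applying active_filter_sql when one is set."""
    sql = f"SELECT COUNT({entity.primary_key_column}) FROM {entity.table_name}"
    if entity.active_filter_sql:
        sql += f" WHERE {entity.active_filter_sql}"
    return sql


booking = Entity(
    name="booking",
    display_name="Booking",
    table_name="bookings",
    primary_key_column="id",
    display_column="id",
    active_filter_sql="status NOT IN ('cancelled')",
    description="A travel booking linking a customer to a destination",
)

print(count_active_sql(booking))
```

The point of the sketch is that the filter lives on the entity, so every query that touches `bookings` can exclude cancelled rows without the question having to say so.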

Metrics

SQL expressions that compute values. Each metric knows its table, unit, and format.

| Field | Example | Purpose |
| --- | --- | --- |
| name | revenue | Machine name |
| sql_expression | SUM(CASE WHEN status IN ('confirmed','completed') THEN total_price_eur END) | The actual SQL |
| base_table | bookings | Which table to query |
| unit | currency | number, currency, percentage |
| format_hint | ,2 EUR | Display formatting |
| is_additive | true | Whether it can be summed across time periods |
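
As a rough illustration of how a metric definition could turn into SQL, and of what `is_additive` guards against, consider the sketch below. `Metric`, `metric_query`, and `combine_periods` are hypothetical helpers, not Binexia functions; the `revenue` values are taken from the table above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Metric:
    """Hypothetical in-memory mirror of one metric definition."""
    name: str
    sql_expression: str
    base_table: str
    unit: str
    format_hint: str
    is_additive: bool


def metric_query(metric: Metric) -> str:
    """Wrap the metric's SQL expression in a SELECT over its base table."""
    return f"SELECT {metric.sql_expression} AS {metric.name} FROM {metric.base_table}"


def combine_periods(metric: Metric, period_values: Sequence[float]) -> float:
    """Sum per-period values, but only for metrics flagged as additive."""
    if not metric.is_additive:
        raise ValueError(f"{metric.name} is not additive; recompute it over the full range")
    return sum(period_values)


revenue = Metric(
    name="revenue",
    sql_expression=(
        "SUM(CASE WHEN status IN ('confirmed','completed') "
        "THEN total_price_eur END)"
    ),
    base_table="bookings",
    unit="currency",
    format_hint=",2 EUR",
    is_additive=True,
)

print(metric_query(revenue))
print(combine_periods(revenue, [100.0, 250.0]))
```

A non-additive metric (an average or a distinct count, for example) would raise here, because adding monthly values does not give the correct yearly value.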

Dimensions

How metrics get sliced and grouped.

| Field | Example | Purpose |
| --- | --- | --- |
| name | booking_date | Machine name |
| dimension_type | time | time or categorical |
| table_name | bookings | Source table |
| column_name | created_at | Source column |
| description | "Supports grouping by day, week, month, or year" | LLM context |

Time dimensions support automatic roll-up (day → week → month → year). Categorical dimensions support filtering and grouping.
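
One plausible way to express these grains in PostgreSQL is `date_trunc`, as in the sketch below. This is an assumption about how grouping might be generated, not a description of Binexia's internals: `Dimension`, `group_expression`, and `sliced_query` are hypothetical, the `booking_count` metric is invented for the example, and the sketch re-aggregates from raw rows at each grain instead of rolling up stored results. It also assumes the metric and the dimension live on the same table.

```python
from dataclasses import dataclass
from typing import Optional

TIME_GRAINS = ("day", "week", "month", "year")


@dataclass(frozen=True)
class Dimension:
    """Hypothetical in-memory mirror of one dimension definition."""
    name: str
    dimension_type: str  # "time" or "categorical"
    table_name: str
    column_name: str


def group_expression(dimension: Dimension, grain: Optional[str] = None) -> str:
    """Return the SQL expression to group by for this dimension."""
    if dimension.dimension_type == "time":
        if grain not in TIME_GRAINS:
            raise ValueError(f"unsupported grain: {grain!r}")
        return f"date_trunc('{grain}', {dimension.column_name})"
    # Categorical dimensions group on the raw column.
    return dimension.column_name


def sliced_query(metric_sql: str, metric_name: str,
                 dimension: Dimension, grain: Optional[str] = None) -> str:
    """Slice one metric expression by one dimension."""
    bucket = group_expression(dimension, grain)
    return (
        f"SELECT {bucket} AS {dimension.name}, {metric_sql} AS {metric_name} "
        f"FROM {dimension.table_name} GROUP BY 1 ORDER BY 1"
    )


booking_date = Dimension("booking_date", "time", "bookings", "created_at")
print(sliced_query("COUNT(*)", "booking_count", booking_date, "month"))
```

Switching the grain from `"month"` to `"year"` changes only the `date_trunc` argument, which is why a single time dimension can serve every grain.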

Vocabulary

Business terms mapped to SQL filters — the bridge between how people talk and how data is structured.

-- "high-value customer" becomes:
ltv_eur > 5000
AND id IN (SELECT customer_id FROM bookings
           GROUP BY customer_id HAVING COUNT(*) >= 2)

-- "at-risk customer" becomes:
id IN (SELECT customer_id FROM bookings GROUP BY customer_id HAVING COUNT(*) >= 2)
AND id NOT IN (SELECT customer_id FROM bookings WHERE travel_date >= CURRENT_DATE - INTERVAL '14 months')
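
The sketch below shows, under simplifying assumptions, how the two filters above could be looked up and attached to a query. In Binexia the matching is done with the help of the LLM; the plain substring match here is a stand-in. `VOCABULARY`, `expand_terms`, and `customers_query` are hypothetical names, and the `customers` table name is an assumption inferred from the `id` and `ltv_eur` columns the filters reference.

```python
# Term -> SQL filter, copied from the two examples above.
VOCABULARY = {
    "high-value customer": (
        "ltv_eur > 5000 AND id IN (SELECT customer_id FROM bookings "
        "GROUP BY customer_id HAVING COUNT(*) >= 2)"
    ),
    "at-risk customer": (
        "id IN (SELECT customer_id FROM bookings "
        "GROUP BY customer_id HAVING COUNT(*) >= 2) "
        "AND id NOT IN (SELECT customer_id FROM bookings "
        "WHERE travel_date >= CURRENT_DATE - INTERVAL '14 months')"
    ),
}


def expand_terms(question: str, vocabulary: dict) -> list:
    """Return the SQL filters for every vocabulary term found in the question."""
    lowered = question.lower()
    return [sql for term, sql in vocabulary.items() if term in lowered]


def customers_query(question: str) -> str:
    """Build a customer lookup, adding one parenthesised filter per matched term."""
    filters = expand_terms(question, VOCABULARY)
    sql = "SELECT id FROM customers"  # table name is an assumption
    if filters:
        sql += " WHERE " + " AND ".join(f"({f})" for f in filters)
    return sql


print(customers_query("Show me all high-value customers"))
```

Wrapping each filter in parentheses keeps its internal `AND` conditions from interacting with other filters when several terms match the same question.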

Each vocabulary entry includes example questions that should trigger it:

| Term | Example questions |
| --- | --- |
| "high-value customer" | "Show me all high-value customers", "Which customers spend the most" |
| "at-risk customer" | "List at-risk customers", "Who might churn soon" |
| "seasonal customer" | "Which customers travel seasonally", "Show seasonal travelers" |

schema_context

The schema_context table holds an auto-generated LLM prompt assembled from all entities, metrics, dimensions, and vocabulary. This is what the AnalyticsAgent receives at query time — not raw DDL, but a curated description of the data model.

When you edit the semantic model in the admin UI, the schema_context is regenerated.
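
To give a feel for what "assembled from all entities, metrics, dimensions, and vocabulary" could mean, here is a minimal sketch that renders the four building blocks into one prompt string. The layout of the prompt and the `build_schema_context` function are assumptions for illustration; the actual format Binexia generates is not documented here.

```python
def build_schema_context(entities, metrics, dimensions, vocabulary):
    """Render the four building blocks as one plain-text prompt (hypothetical layout)."""
    lines = ["ENTITIES"]
    lines += [
        f"- {e['name']} -> table {e['table_name']}: {e['description']}"
        for e in entities
    ]
    lines.append("METRICS")
    lines += [
        f"- {m['name']} ({m['unit']}) = {m['sql_expression']}"
        for m in metrics
    ]
    lines.append("DIMENSIONS")
    lines += [
        f"- {d['name']} [{d['dimension_type']}] = {d['table_name']}.{d['column_name']}"
        for d in dimensions
    ]
    lines.append("VOCABULARY")
    lines += [f'- "{term}" => {sql}' for term, sql in vocabulary.items()]
    return "\n".join(lines)


context = build_schema_context(
    entities=[{
        "name": "booking",
        "table_name": "bookings",
        "description": "A travel booking linking a customer to a destination",
    }],
    metrics=[{
        "name": "revenue",
        "unit": "currency",
        "sql_expression": (
            "SUM(CASE WHEN status IN ('confirmed','completed') "
            "THEN total_price_eur END)"
        ),
    }],
    dimensions=[{
        "name": "booking_date",
        "dimension_type": "time",
        "table_name": "bookings",
        "column_name": "created_at",
    }],
    vocabulary={"high-value customer": "ltv_eur > 5000"},  # filter shortened for brevity
)
print(context)
```

Because the prompt is derived entirely from the definitions, regenerating it after an edit keeps what the agent sees in step with the model.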

The Admin Semantic Model Editor

Settings → Semantic Model in the admin UI provides a visual editor for all four building blocks. Changes take effect immediately for new queries. Existing cached results remain until their TTL expires.
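
The interaction between edits and cached results can be pictured with a small time-to-live cache. This is a generic sketch, not Binexia's cache: the `TTLCache` class, the idea of keying entries by question text, and the 300-second TTL are all assumptions chosen for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical result cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, object]] = {}

    def put(self, key: str, value: object) -> None:
        """Store a value together with its expiry time."""
        self._store[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[object]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]
            return None
        return value


# A fake clock makes the expiry behaviour deterministic for the demo.
now = [0.0]
cache = TTLCache(ttl_seconds=300, clock=lambda: now[0])
cache.put("revenue this month", "result computed with the old definition")

now[0] = 299.0
print(cache.get("revenue this month"))  # still served from cache

now[0] = 300.0
print(cache.get("revenue this month"))  # expired, so the query must be re-run
```

Under this model, a result computed before a metric edit keeps being served until its entry expires, after which the next identical question is answered with the updated definition.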