Semantic Model

The semantic model is the central concept in Binexia. It sits between the raw database schema and the LLM and translates business concepts into queryable SQL. Every natural-language query passes through it.

Info: The semantic model is the single source of truth. Metabase, widgets, and agent queries all derive from it. When you change a metric definition here, downstream consumers update automatically.

The Four Building Blocks

Entities

Map domain tables to natural language names that the LLM understands.

Field	Example	Purpose
`name`	`booking`	Machine name used in queries
`display_name`	`Booking`	Human-readable name
`table_name`	`bookings`	Actual PostgreSQL table
`primary_key_column`	`id`	Primary key
`display_column`	`id`	Default display value
`active_filter_sql`	`status NOT IN ('cancelled')`	Excludes inactive rows from queries
`description`	"A travel booking linking a customer to a destination"	LLM context
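
As a rough illustration of how an entity record could drive query generation, the sketch below models the fields from the table as a Python dataclass and applies `active_filter_sql` as a `WHERE` clause. The `Entity` class, the `base_query` helper, and the `label` alias are hypothetical names chosen for this example, not part of Binexia.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    # Field names mirror the table above; the class itself is hypothetical.
    name: str
    display_name: str
    table_name: str
    primary_key_column: str
    display_column: str
    active_filter_sql: Optional[str]
    description: str


def base_query(entity: Entity) -> str:
    """Build a SELECT that only returns rows the entity treats as active."""
    sql = (
        f"SELECT {entity.primary_key_column} AS id, "
        f"{entity.display_column} AS label "
        f"FROM {entity.table_name}"
    )
    if entity.active_filter_sql:
        sql += f" WHERE {entity.active_filter_sql}"
    return sql


booking = Entity(
    name="booking",
    display_name="Booking",
    table_name="bookings",
    primary_key_column="id",
    display_column="id",
    active_filter_sql="status NOT IN ('cancelled')",
    description="A travel booking linking a customer to a destination",
)

print(base_query(booking))
```

With the example values from the table, the generated statement selects from `bookings` and excludes cancelled rows.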

Metrics

SQL expressions that compute values. Each metric knows its table, unit, and format.

Field	Example	Purpose
`name`	`revenue`	Machine name
`sql_expression`	`SUM(CASE WHEN status IN ('confirmed','completed') THEN total_price_eur END)`	The actual SQL
`base_table`	`bookings`	Which table to query
`unit`	`currency`	Value type: `number`, `currency`, or `percentage`
`format_hint`	`,2 EUR`	Display formatting
`is_additive`	`true`	Whether it can be summed across time periods
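
To make the role of `sql_expression` and `base_table` concrete, here is a minimal sketch that wraps a metric in a `SELECT` and runs it. It assumes an in-memory SQLite database as a stand-in for PostgreSQL, and the `Metric` class, the `metric_query` helper, and the three sample rows are invented for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    # Field names mirror the table above; the class itself is hypothetical.
    name: str
    sql_expression: str
    base_table: str
    unit: str
    format_hint: str
    is_additive: bool


def metric_query(metric: Metric) -> str:
    """Wrap the metric's SQL expression in a SELECT over its base table."""
    return f"SELECT {metric.sql_expression} AS {metric.name} FROM {metric.base_table}"


revenue = Metric(
    name="revenue",
    sql_expression=(
        "SUM(CASE WHEN status IN ('confirmed','completed') "
        "THEN total_price_eur END)"
    ),
    base_table="bookings",
    unit="currency",
    format_hint=",2 EUR",
    is_additive=True,
)

# SQLite stands in for PostgreSQL here; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (id INTEGER PRIMARY KEY, status TEXT, total_price_eur REAL)"
)
conn.executemany(
    "INSERT INTO bookings (status, total_price_eur) VALUES (?, ?)",
    [("confirmed", 1200.0), ("completed", 800.0), ("cancelled", 500.0)],
)

total = conn.execute(metric_query(revenue)).fetchone()[0]
print(total)  # 2000.0: the cancelled booking is ignored by the CASE expression
```

Because the filter lives inside the metric's expression, every consumer that asks for `revenue` gets the same definition without repeating the status logic.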

Dimensions

How metrics get sliced and grouped.

Field	Example	Purpose
`name`	`booking_date`	Machine name
`dimension_type`	`time`	`time` or `categorical`
`table_name`	`bookings`	Source table
`column_name`	`created_at`	Source column
`description`	"Supports grouping by day, week, month, or year"	LLM context

Time dimensions support automatic roll-up (day → week → month → year). Categorical dimensions support filtering and grouping.
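
One plausible way to implement the roll-up is to wrap the source column in PostgreSQL's `date_trunc` at the requested grain. The source does not say how Binexia does this, so treat the following as a sketch under that assumption; `Dimension`, `group_expression`, and the `destination` dimension are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Grains named in the text; PostgreSQL's date_trunc accepts each of them.
TIME_GRAINS = ("day", "week", "month", "year")


@dataclass(frozen=True)
class Dimension:
    name: str
    dimension_type: str  # "time" or "categorical"
    table_name: str
    column_name: str


def group_expression(dim: Dimension, grain: Optional[str] = None) -> str:
    """Return the SQL expression to GROUP BY for this dimension."""
    column = f"{dim.table_name}.{dim.column_name}"
    if dim.dimension_type != "time":
        return column  # categorical: group on the raw column
    if grain not in TIME_GRAINS:
        raise ValueError(f"unsupported grain: {grain!r}")
    return f"date_trunc('{grain}', {column})"


booking_date = Dimension("booking_date", "time", "bookings", "created_at")
destination = Dimension("destination", "categorical", "bookings", "destination_id")

for grain in TIME_GRAINS:
    print(group_expression(booking_date, grain))
print(group_expression(destination))
```

The same dimension definition serves every grain, which is why only the column needs to be stored in the model.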

Vocabulary

Business terms mapped to SQL filters — the bridge between how people talk and how data is structured.

-- "high-value customer" becomes:
ltv_eur > 5000
AND id IN (SELECT customer_id FROM bookings
           GROUP BY customer_id HAVING COUNT(*) >= 2)

-- "at-risk customer" becomes:
id IN (SELECT customer_id FROM bookings GROUP BY customer_id HAVING COUNT(*) >= 2)
AND id NOT IN (SELECT customer_id FROM bookings WHERE travel_date >= CURRENT_DATE - INTERVAL '14 months')
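
The sketch below shows how such a filter might be spliced into a query once a term has been recognized. It is illustrative only: the `VOCABULARY` dict and `filter_for` helper are hypothetical, the `customers` table name is an assumption (the filters above only imply `id` and `ltv_eur` columns), and SQLite stands in for PostgreSQL with made-up rows.

```python
import sqlite3

# Term -> SQL filter, copied from the "high-value customer" definition above.
VOCABULARY = {
    "high-value customer": (
        "ltv_eur > 5000 "
        "AND id IN (SELECT customer_id FROM bookings "
        "GROUP BY customer_id HAVING COUNT(*) >= 2)"
    ),
}


def filter_for(term: str) -> str:
    """Look up the SQL filter for a business term (case-insensitive)."""
    return VOCABULARY[term.lower()]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, ltv_eur REAL);
    CREATE TABLE bookings (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers (id, ltv_eur) VALUES (1, 9000), (2, 7000), (3, 1000);
    INSERT INTO bookings (customer_id) VALUES (1), (1), (2), (3), (3);
    """
)

sql = f"SELECT id FROM customers WHERE {filter_for('High-value customer')} ORDER BY id"
rows = conn.execute(sql).fetchall()
print(rows)  # [(1,)]: only customer 1 has both ltv_eur > 5000 and two bookings
```

Customer 2 fails the booking-count condition and customer 3 fails the lifetime-value condition, so both parts of the definition are doing work.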

Each vocabulary entry includes example questions that should trigger it:

Term	Example questions
"high-value customer"	"Show me all high-value customers", "Which customers spend the most"
"at-risk customer"	"List at-risk customers", "Who might churn soon"
"seasonal customer"	"Which customers travel seasonally", "Show seasonal travelers"

schema_context

The `schema_context` table holds an auto-generated LLM prompt assembled from all entities, metrics, dimensions, and vocabulary. At query time the AnalyticsAgent receives this curated description of the data model rather than raw DDL.

When you edit the semantic model in the admin UI, the schema_context is regenerated.
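
The actual prompt format is not documented here, so the following is only a sketch of what assembling such a context could look like. The `build_schema_context` function, the section labels, and the line layout are all hypothetical.

```python
from typing import Dict, List


def build_schema_context(
    entities: List[Dict[str, str]],
    metrics: List[Dict[str, str]],
    dimensions: List[Dict[str, str]],
    vocabulary: Dict[str, str],
) -> str:
    """Flatten the four building blocks into one prompt string."""
    lines = ["ENTITIES"]
    for e in entities:
        lines.append(f"- {e['name']} (table {e['table_name']}): {e['description']}")
    lines.append("METRICS")
    for m in metrics:
        lines.append(
            f"- {m['name']} on {m['base_table']} [{m['unit']}]: {m['sql_expression']}"
        )
    lines.append("DIMENSIONS")
    for d in dimensions:
        lines.append(
            f"- {d['name']} ({d['dimension_type']}): {d['table_name']}.{d['column_name']}"
        )
    lines.append("VOCABULARY")
    for term, sql_filter in vocabulary.items():
        lines.append(f'- "{term}" means: {sql_filter}')
    return "\n".join(lines)


context = build_schema_context(
    entities=[{
        "name": "booking",
        "table_name": "bookings",
        "description": "A travel booking linking a customer to a destination",
    }],
    metrics=[{
        "name": "revenue",
        "base_table": "bookings",
        "unit": "currency",
        "sql_expression": "SUM(CASE WHEN status IN ('confirmed','completed') "
                          "THEN total_price_eur END)",
    }],
    dimensions=[{
        "name": "booking_date",
        "dimension_type": "time",
        "table_name": "bookings",
        "column_name": "created_at",
    }],
    vocabulary={"high-value customer": "ltv_eur > 5000"},
)
print(context)
```

Regenerating the context on every edit then amounts to calling a function like this again and storing the result.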

The Admin Semantic Model Editor

Settings → Semantic Model in the admin UI provides a visual editor for all four building blocks. Changes take effect immediately for new queries. Existing cached results remain until their TTL expires.
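
The caching behavior can be pictured with a small TTL cache. The source does not describe how Binexia caches results, so everything in this sketch is an assumption: the `ResultCache` class, keying on the question string, and the 60-second TTL are hypothetical. A fake clock is injected so the example runs deterministically.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ResultCache:
    """Hypothetical result cache: entries live until their TTL expires."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, question: str, value: Any) -> None:
        self._entries[question] = (self._clock(), value)

    def get(self, question: str) -> Optional[Any]:
        entry = self._entries.get(question)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[question]  # expired: the next query recomputes
            return None
        return value


now = [0.0]  # fake clock, in seconds
cache = ResultCache(ttl_seconds=60, clock=lambda: now[0])
cache.put("revenue this month", 2000.0)

# Editing the semantic model does not touch the cache in this sketch.
now[0] = 30.0
still_cached = cache.get("revenue this month")  # old result is still served

now[0] = 61.0
after_expiry = cache.get("revenue this month")  # None: TTL has passed

print(still_cached, after_expiry)
```

Under this model, a metric edit shows up in a cached answer only after the entry expires and the query runs again against the updated definition.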