Guides
Building a Semantic Model
Step-by-step guide to creating a semantic model for your vertical.
The semantic model is what makes natural language queries work. This guide walks through building one, using IntelliTravel as a real example.
Step 1: Define Entities
Entities map your domain tables to natural language names. For each table that users might ask about:
- Go to Settings → Semantic Model → Entities
- Add the table with a clear name and description
IntelliTravel example:
| name | table_name | display_column | active_filter_sql | description |
|---|---|---|---|---|
customer | customers | name | — | Travel agency customer with segment and lifetime value |
booking | bookings | id | status NOT IN ('cancelled') | Travel booking linking customer to destination |
destination | destinations | name | — | Available travel destination |
supplier | suppliers | name | is_active = true | Service supplier (hotel, transfer, activity) |
contact_history | contact_history | id | — | Customer interaction record |
Warning
active_filter_sql is critical. Without it, queries count cancelled bookings, inactive suppliers, etc. Always think about what "active" means for each entity.
Step 2: Add Metrics
For each number users might ask about:
- Go to Settings → Semantic Model → Metrics
- Define the SQL expression, unit, and format
Key fields:
sql_expression— The actual SQL. UseCASE WHENfor conditional aggregation.base_table— Which table this metric queries.unit—number,currency, orpercentage.format_hint— How to display:,0for integers,,2 EURfor euros,0.0%for percentages.is_additive— Can this metric be summed across time periods? Revenue: yes. Average booking value: no.
IntelliTravel example:
| name | sql_expression | unit | is_additive |
|---|---|---|---|
total_bookings | COUNT(*) | number | true |
revenue | SUM(CASE WHEN status IN ('confirmed','completed') THEN total_price_eur END) | currency | true |
avg_booking_value | SUM(...) / NULLIF(COUNT(...), 0) | currency | false |
cancellation_rate | COUNT(*) FILTER (WHERE status = 'cancelled') * 100.0 / NULLIF(COUNT(*), 0) | percentage | false |
Step 3: Add Dimensions
Dimensions define how metrics get sliced:
- Time dimensions —
booking_date,travel_date. Enable day/week/month/year roll-up. - Categorical dimensions —
destination,booking_channel,customer_segment,supplier.
For each dimension, specify the table and column. The description field is LLM context — explain what the dimension means and what kind of grouping makes sense.
Step 4: Define Vocabulary
Vocabulary terms bridge business language to SQL. This is where domain expertise matters most.
Good vocabulary entries:
| Term | What it means | SQL filter |
|---|---|---|
| "high-value customer" | LTV > 5000 EUR + 2+ bookings | ltv_eur > 5000 AND id IN (SELECT ... HAVING COUNT(*) >= 2) |
| "at-risk customer" | 2+ past bookings, none in 14 months | Complex subquery on booking recency |
| "season" | High/Shoulder/Low demand period | CASE WHEN EXTRACT(MONTH ...) IN (6,7,8) THEN 'High' ... |
Each entry includes example questions — the phrases that should trigger this term:
- "high-value customer" → ["Show me all high-value customers", "Which customers spend the most"]
Step 5: Test
- Create a Raw Query widget on a test dashboard
- Ask questions that use your vocabulary terms
- Verify the generated SQL is correct
- Check chart rendering
Common issues:
- Wrong
base_table— Metric queries the wrong table - Missing
active_filter_sql— Includes inactive/cancelled records - Overly complex
sql_expression— Simplify; the LLM handles complexity - Vocabulary too narrow — Add more example questions per term
What Happens Next
When you save the semantic model:
schema_contextis regenerated for the LLM- Existing cached queries remain until TTL expires
- Metabase and widgets that reference changed metrics are flagged for review