If you're working in Databricks and you're tired of repeating the same joins and recalculations across dashboards or queries, metric views are a game-changer.
As an analyst, I often found myself duplicating logic, rechecking definitions, or digging through tables just to answer simple questions. Then I discovered metric views, and everything got a lot smoother.
Metric views let you define your data logic once and reuse it across dashboards, Genie Rooms, and notebooks. They simplify maintenance, improve consistency, and save time across every project.
Here’s a simple step-by-step guide to building your first metric view, even if you’re not a data engineer.
A metric view is a curated layer that sits between your raw data and your dashboards or reporting tools. It contains your source table, the joins to related dimension tables, the dimensions you group and filter by, and the measures you report on.
Instead of repeating this setup in every dashboard, you define it once and reuse it across multiple tools.
Measures in metric views can range from simple aggregated values (SUM, COUNT, AVG) to more advanced types like rolling windows, month-to-date (MTD), year-to-date (YTD), and growth calculations. You can also create calculated dimensions for grouping, filtering, and labelling, such as flags for weekends or mapping status codes to readable names. Joins are always left outer joins, ensuring you retain all records from your base table.
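As a rough sketch of what calculated dimensions can look like (the `delivery_date` and `status_code` columns, and the status values, are placeholders, not from a real schema):

```yaml
dimensions:
  # Calculated dimension: flag weekend deliveries
  # (dayofweek returns 1 = Sunday .. 7 = Saturday in Databricks SQL)
  - name: is_weekend
    expr: CASE WHEN DAYOFWEEK(source.delivery_date) IN (1, 7) THEN 'Weekend' ELSE 'Weekday' END
  # Calculated dimension: map raw status codes to readable labels
  - name: delivery_status_label
    expr: CASE source.status_code WHEN 'OT' THEN 'On Time' WHEN 'LT' THEN 'Late' ELSE 'Other' END
```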
Databricks metric views are defined using YAML. You can write this directly in your workspace as a .yaml file, or draft it in any text editor and upload it.
Here's the basic structure:
```yaml
version: 0.1
source: your_catalog.your_schema.your_fact_table
joins:
  - name: dimension_name
    source: your_catalog.your_schema.dimension_table
    on: source.key = dimension_name.key
dimensions:
  - name: dimension_name
    expr: column_name_or_expression
measures:
  - name: metric_name
    expr: SQL_expression
    description: What this metric measures
    format: percent/integer/decimal
```
You can also define more complex measures. Here's a complete worked example:
```yaml
version: 0.1
source: analytics.cpg.fact_delivery_lines
joins:
  - name: customers
    source: analytics.cpg.dimension_customers
    on: source.customer_id = customers.customer_id
dimensions:
  - name: customer_name
    expr: customers.customer_name
  - name: delivery_month
    expr: DATE_TRUNC('month', source.delivery_date)
measures:
  - name: otif_percentage
    expr: SUM(CASE WHEN delivery_status = 'OnTime' THEN 1 ELSE 0 END) / COUNT(*)
    description: On-Time In-Full delivery percentage
    format: percent
```
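To illustrate a more advanced measure, you could extend the example with something like the following sketch. The `order_date` column and the `average_lead_time_days` metric are assumptions for illustration, not part of the schema above:

```yaml
measures:
  # Hypothetical measure: average days from order to delivery
  # (assumes the fact table has an order_date column)
  - name: average_lead_time_days
    expr: AVG(DATEDIFF(source.delivery_date, source.order_date))
    description: Average days between order placement and delivery
    format: decimal
```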
Once your YAML is written, it's time to bring it into Databricks and test it.
Use the metric view explorer in Databricks to preview it.
Check that your joins resolve, your dimensions group as expected, and your measures return sensible values.
For time-based measures like YTD, make sure the date logic matches your business rules. Trailing windows use the latest date in the dataset rather than today’s system date.
You can run quick test queries or connect it to a lightweight dashboard to confirm everything works as expected.
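A quick smoke test against the example view above might look like this (the three-part view name is a placeholder; in Databricks, measures in a metric view are queried through the MEASURE() aggregate function):

```sql
-- Smoke test: one measure grouped by one dimension
-- (analytics.cpg.delivery_metrics is a placeholder view name)
SELECT
  delivery_month,
  MEASURE(otif_percentage) AS otif
FROM analytics.cpg.delivery_metrics
GROUP BY delivery_month
ORDER BY delivery_month
LIMIT 10;
```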
Once it's looking good, it's time to set up a Genie Room.
To create a Genie Room, set up a new Genie space in Databricks and connect your metric view as its data source.
This connects your curated data model to a natural language interface that business users can explore confidently.
To make the Genie Room even more useful, add enhancements directly into your metric view YAML:
Example:
```yaml
- name: delivery_date
  description: Date the delivery was scheduled or occurred
  synonyms: ['delivery date', 'shipment date', 'scheduled date']
```
In your Genie Room settings, also add and refine sample questions and instructions that steer answers toward your definitions.
These make the Room much more accessible for non-technical users.
Once your Genie Room is set up and connected to your metric view, test it properly to make sure it works as expected for end users. This is your chance to check that everything, from synonyms to sample questions, performs correctly.
Here’s how to test it effectively:
Now test it from a user's point of view. Ask questions like "What was the OTIF percentage last month?" or "Which customers had the lowest on-time rate this year?"
Check that the answers use the right measures, that synonyms resolve to the intended columns, and that the numbers match your test queries.
If something doesn’t work as expected, go back to your YAML or Genie Room settings and refine it. Ask someone who isn’t familiar with the underlying data to try it out. If they can find what they need without asking you for help, your setup is working correctly.
Once it’s tested, you can start using the metric view across Databricks:
Any updates made to the metric view will automatically flow through all of these tools, reducing rework and improving trust in your data.
Now that your metric view is ready, you can use it to build a dashboard in Databricks without writing complex SQL or redoing your calculations.
Here's how:
```sql
-- Measures in a metric view are wrapped in MEASURE()
-- and grouped by the dimensions you want to slice by
SELECT
  delivery_month,
  customer_name,
  MEASURE(otif_percentage) AS otif_percentage
FROM catalog.schema.metric_view_name
GROUP BY delivery_month, customer_name
```
You can repeat this process to build out more charts, using different queries from the same metric view. You can also add filters (like month or customer) to let users explore the data. If you ever need to update a calculation, you can make the change once in the metric view and see it reflected across all connected visualisations automatically.
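For instance, a filtered variant of the earlier query might look like this (the view name and the customer value are placeholders for illustration):

```sql
-- Same measure, restricted to one (hypothetical) customer
SELECT
  delivery_month,
  MEASURE(otif_percentage) AS otif_percentage
FROM catalog.schema.metric_view_name
WHERE customer_name = 'Acme Foods'
GROUP BY delivery_month
```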
Because metric views centralise logic, a change to a calculation updates every connected dashboard instantly; no duplicated SQL fixes needed.
You don't need to be a data engineer to build a metric view. You just need to understand your data and the questions your stakeholders are asking.
With Databricks metric views, you can define trusted logic once and reuse it across dashboards, Genie Rooms, and notebooks. It’s one of the most effective ways I’ve found to deliver consistent, scalable analytics. It has made my work faster, more accurate, and much easier to maintain.
If you haven't tried them yet, now is the time.
Ready to see how metric views can work in your environment? Talk to our experts and discover how to put them into practice.