Semantic Model Refresh in Fabric: Beyond the Defaults

Written by Aysha Hamisi | May 2, 2025 2:53:02 PM

Overview

In Power BI there are multiple ways to refresh an import mode semantic model from within a workspace. The familiar, standard methods, using the manual refresh button or scheduling a refresh via the settings on a model, are straightforward and in many scenarios sufficient to refresh your models.

But what if you need more control, with options to override the defaults applied by the standard refresh? Maybe you have one or more of the following:

  • Large volumes that require incremental refresh or a custom partition strategy
  • A small processing window and the duration of the refresh must be minimised
  • Backdated changes that require a full refresh outside the incremental refresh
  • Only a subset of the model tables require a daily refresh
  • Network or latency challenges that cause refresh operations to regularly fail

Historically, dealing with these types of scenarios required a more code-heavy solution, for example PowerShell scripts orchestrated from Azure Data Factory (ADF) or Azure DevOps.

With Power BI in Fabric we have additional options leveraging data pipelines or semantic link, either low code or no code depending on the customisation your use case requires. These options are available to you even if you source data from outside Fabric or if the models being refreshed are not in F SKU workspaces: you only need to initiate the refresh from a Fabric-backed workspace.

Semantic model refresh activity (Preview)

Data pipelines in Fabric have a Semantic model refresh activity in preview. It offers some very useful options to customise a refresh without the need for any code at all. Create a new data pipeline (see the official documentation) and add the activity:

Add a Semantic model refresh activity

Configure the Settings

Pick the workspace and model, and optionally specific tables instead of the entire model:

The Select partitions option gives you extra control over the objects included in the refresh and even allows you to specify which partitions to refresh from a table with multiple partitions.

The pipeline activity executes an enhanced refresh via the API, which offers additional customisation options.

Advanced Settings

There's a simple interface to configure advanced settings:

Wait on completion

By default the enhanced refresh via the API is asynchronous, meaning that the application that triggers the refresh does not wait for it to complete. You would normally need to check the status in a subsequent activity or command, and any subsequent activity in a workflow would start immediately even though the refresh may still be executing. With this option ON, the pipeline is forced to wait until the refresh has completed, which opens up the ability to sequence refreshes one after the other without the need to poll for status. We can avoid the common approach of scheduling refreshes at estimated intervals and hoping they complete in the estimated window!

This option is not currently available in the enhanced refresh API itself, which makes this pipeline activity particularly interesting for those looking for customisable, low-code implementations.

Max parallelism

The default setting for Power BI or Fabric models is a maximum of 6 concurrent processing operations. Increasing the parallelism can significantly reduce the refresh duration if you have many large tables or partitions.

There is an option in the pbix settings to increase the parallelism from Power BI Desktop, but the pipeline activity allows for easy adjustment on a published model, which can be useful since a larger number is not necessarily faster. Testing various configurations may be required to find the optimal max parallelism for your particular scenario.

The max parallelism is limited by your capacity SKU. Be aware also that your data source may limit concurrent requests and throttle below the maximum you expect from this setting. It may be unwise in any case to issue a high number of parallel requests to the data source.

Retry count & Commit mode

If you experience network issues that cause refresh operations to fail, it can be helpful to configure these settings. The Partial Batch commit mode effectively enables your refresh to restart from where it left off, rather than restarting entirely after a failure.

Retry count on its own can also be useful to override, since by default the service can retry a failed refresh multiple times, and in some scenarios this may not be ideal. Setting a retry count of 1 terminates the processing without retrying many times.

Scheduling

The pipeline can be run manually or scheduled using a familiar interface here, with some additional options available; see Chris Webb's blog on refreshes with Fabric data pipelines.

In summary, the new Semantic model refresh activity adds useful functionality, allowing you to easily customise a model refresh and sequence multiple models, with a visual interface showing progress and status, e.g.:

You can also visualise the sequencing in the Gantt view here:

Refresh using Semantic Link

There has for some time been excitement around the functionality opened up by semantic link in Fabric. In short, it allows you to connect to your semantic models via the sempy Python library and administer or query models from a Fabric notebook. See Microsoft's documentation for sempy.fabric.refresh_dataset.

Whilst it does involve some code, configuring and executing a semantic model refresh is relatively easy to understand and write. Refreshing a model using the default settings (i.e. the same as pressing the manual refresh button on a model in the service) can be done via these steps:

Create a Fabric notebook

Import the semantic link library

Add a code cell to import the library:
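import sempy.fabric as fabric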

Trigger a refresh of your model

Add a code cell to run a standard refresh:
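A minimal cell, reusing the workspace and model names from the full example later in this post:

# Trigger a standard (asynchronous) refresh and capture the request id
request_id = fabric.refresh_dataset(
    workspace = "Aysha-WIP",
    dataset = "Sales Model"
)
print(request_id)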

Since this uses the asynchronous API refresh, we can see that the command completes immediately, even though the refresh is still running. The returned GUID is the refresh request id that can be used to check the status, e.g.:
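A sketch using sempy's get_refresh_execution_details function, with the request_id variable captured from the previous cell:

# Check the status of a specific refresh using its request id
details = fabric.get_refresh_execution_details(
    dataset = "Sales Model",
    refresh_request_id = request_id,
    workspace = "Aysha-WIP"
)
print(details)  # includes the status plus start/end times and any messages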

To get the status of all refreshes for a model you run this statement:
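A sketch using sempy's list_refresh_requests function, which returns the refresh history as a DataFrame:

# List the status of all refresh requests for the model
fabric.list_refresh_requests(
    dataset = "Sales Model",
    workspace = "Aysha-WIP"
)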

The scenario above is easily achievable using the default clicky clicky draggy droppy refresh interface that's been around for years, so one could argue it's not worth writing code for. However, there is a scenario that has historically not been feasible without writing code and orchestrating outside of the service: overriding an incremental refresh policy, which becomes quite straightforward in a Fabric notebook.

Semantic Link to override incremental refresh policy

Imagine the scenario: we have 6 years of data in the model and incrementally refresh the latest month or quarter. In an ideal world, when we set up an incremental refresh policy to limit the volume of data being refreshed each time, we only face the challenge of refreshing the historical partitions once, as a one-off activity at the start.

Unfortunately, real data is often more complicated and there can be a need to occasionally refresh the full model due to backdated data changes, or to make corrections for mistakes in upstream data processing.

Semantic link offers a relatively simple method to do this in a repeatable way, all within the service, provided you have at least one Fabric-backed workspace in which to create and run notebooks.

We can import the semantic link library and then execute a refresh command with the apply_refresh_policy option to override the incremental refresh policy, and refresh the full model including historical partitions:

import sempy.fabric as fabric

# Run a full refresh of the model, overriding the policy
fabric.refresh_dataset(
    workspace = "Aysha-WIP",
    dataset = "Sales Model",
    refresh_type = "full",
    apply_refresh_policy = False  # switch off the incremental refresh policy
)

By default the max parallelism using this method is 10, but we can configure this and other additional parameters to unlock even more customisation. For example, run syntax like the following to selectively refresh only specific partitions or tables, and additionally increase the parallelism further:
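A sketch of that call; the table and partition names here are hypothetical, while the workspace and model names reuse the earlier example:

# Refresh only selected objects and raise the parallelism
fabric.refresh_dataset(
    workspace = "Aysha-WIP",
    dataset = "Sales Model",
    refresh_type = "full",
    max_parallelism = 20,  # raise from the default of 10
    objects = [
        {"table": "Sales", "partition": "Sales2024"},  # hypothetical partition
        {"table": "Customer"}                          # refresh a whole table
    ]
)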

The object definition is TMSL and can be easily generated from SSMS or Tabular Editor.

Scheduling the Fabric notebook

If the refresh needs to be scheduled, it is recommended that you add the notebook to a data pipeline and schedule the data pipeline rather than the notebook itself, e.g. add a Notebook activity to a data pipeline:

Monitoring your refreshes

All the refresh operations initiated by the methods described are visible in the Monitor hub, and you can separately see the status of a pipeline as well as any models it executes:

Conclusion

Semantic model refresh in Fabric can be achieved in multiple ways, but the best option for you depends on your requirements. Hopefully after reading this article you've got some tips on when to use the different methods available, especially if you are facing refresh issues or need to speed things up.

Have questions or want to find out more about how this could benefit your organisation? Contact us today and let's discuss how we can help you optimise your semantic models!