Incremental Refresh in Power BI is an incredibly useful and powerful feature to save you both time and money. Understanding how to deploy it in Power BI is an often overlooked but essential skill. Like many of the features in Power BI there is a level of nuance to understanding when it’s appropriate and when it can result in incorrect information being presented on reports.
We’ll explain how to setup incremental refresh, but more importantly, we highlight several use cases and point out several best practices and considerations when setting up your Power BI data sources.
Let’s jump in!
Table of Contents
What is Power BI Incremental Refresh?
Incremental Refresh is a functionality within Power BI that allows people to update only the newly added or most recent information in a data model. Data is partitioned and new data is appended to the already existing data. This technique is especially useful for large datasets. It’s much more efficient for keeping your data up to date compared to fully refreshing a dataset.
Full refreshes are the default setting in Power BI and they work by essentially deleting all of the existing data and importing data from a data source into a Power BI data model, which is stored on the Power BI service or in a .PBIX file. When a full refresh occurs, all of the data has to be transferred into Power BI prior to being able to use it in calculations and reporting.
There are a number of benefits to using incremental refresh as opposed to a full data refresh.
Why You Should Use Incremental Refresh
Because all of the data has to be imported, instead of only the new data it can take a significant amount of time and compute resources when working with large datasets. Along with time savings there are a number of other reasons you should consider implementing incremental refresh in your next Power BI project.
- Time Savings – When using incremental refresh it will be faster to reload Power Query during the report development phase, and will take less time to refresh if you have a Power BI dataset scheduled to refresh through ought the day or week.
- Cost Savings – Cloud database and data warehouse services charge typically charge you based on compute usage, the amount of data transferring between systems or both. Desire their popularity, Azure, Snowflake, AWS and other services can become very expensive when you’re moving large amounts of data around. Minimizing the amount of data being transferred into Power BI keeps your costs down.
- Reduced Power BI Service Load – Data refreshes also use resources in the Power BI service. When you use incremental refresh it will reduce the load on Power BI and let you push out or avoid having to upgrade to a higher tier of reserved compute power for Power BI Premium.
As you can see there are benefits for both small and large deployments. While smaller deployments may not rely on transferring data from cloud services or use Power BI Premium as often they will still see the benefits of time savings when developing reports.
When Should You Use Incremental Refresh?
Incremental refresh has a number of benefits but isn’t appropriate or worth the time to setup for all of your datasets. Typically you will want to look at incremental refresh under the following scenarios:
- Large datasets that suffer from long load times
- Cloud-based data sources with volume-based charge structures
- Datasources that are frequently updated
It’s also worth noting that the cause of a long load time can be caused by factors outside of the size of a dataset. For example, developers working over a slow VPN or on a bad connection will see benefits of incremental refresh but the benefit will be minimal once the dashboard is published to the Power BI service.
In cases such as these it may make sense to export a dataset to a .csv or .xlsx and switch it to a larger dataset once the report is built out.
Pre-requisites for Incremental Refresh
To effectively implement Incremental Refresh, you need at least one column that can serve as a parameter to filter data. Usually it’s a date column, like invoice date, or sales order date. You may also want to work with your data engineering team if a table does not have a visible date column, often times they can add a date stamp to tell when a data warehouse was last updated, or they’ll already have one in a dev environment that they can make visible to you.
How to Setup Incremental Refresh in Power BI
Setting up incremental refresh in Power BI includes several steps. It starts with defining the start and end date of when you want to refresh the Power BI dataset. This is done by creating parameters. Parameters are essentially place holders that can be dynamically updated for other values. You then apply filters to the dataset, publish the dataset, configure it in the Power BI service and then establish a refresh schedule.
We’ll break these steps down into greater detail.
1.) Creating Parameters
To create a parameter, first launch the Power Query editor. You can do this from the report view by right clicking on a data field and click edit query. Then navigate to the Home tab of the Power Query ribbon, and click Manager Parameters.
Select New Parameter which launches the Parameter Editor.
You will need to define two parameters, a start date and an end date for the incremental refresh. Use the names RangeStart and RangeEnd, they are special reserved names in Power BI. Define the data type as Date or Date/Time and a starting value that you would like to use.
There is a small “New” button at the top of the parameter names where you can add new ones without closing out of the screen.
Parameters will show up on the left side under your dataset queries. You can click on Manage Parameter to edit them if needed.
2.) Filter Existing Dataset
Within Power Query, go back to your primary dataset by selecting the Query or table that you’re working with. This is where we will apply a Custom Filter on the Date Column or other column that you want to filter your data.
There is a small icon in the middle of the row filter that lets you select a Parameter.
3.) Close Power Query and Setup the Incremental Refresh Settings
Setting up incremental refresh settings takes place in the Report View of Power BI. Close out of Power Query, wait for your dataset to load and then right click on a table and select Incremental Refresh.
This launches the incremental refresh setup screen. If this screen is not available to launch, go back and make sure that your parameters are properly named and they have been assigned to a column for filtering.
From here you have a few options, you can choose how long to archive data for, and you have the ability to determine what the most recent period is that you want to be refreshed. In typical accounting scenarios we would probably set this to a t least 2 months to ensure that partial month data isn’t being pulled in when working with an ERP, but would want to take extra care to make sure that data is finalized prior to pulling it in.
Values that are set are used to replace the RangeStart and RangeEnd values that were previously defined in parameters.
4.) Publish Report and Perform Initial Refresh
After setting up the date ranges for the refresh, publish your report by clicking the Publish button on the report viewer and put the report into a workspace on the Power BI Service.
Your first refresh will need to be done manually and may take a long time as it pulls in the full starting dataset. To manually refresh a dataset log into PowerBI.com, go to the workspace and click on Refresh Now next to the dataset.
The button to the right of Refresh Now will allow you to setup an automatic refresh schedule. The options available will depend on the data source location, and whether you are working in a Pro or Premium workspace.
We also recommend checking out the official documentation on Microsoft Learn related to incremental refresh, as with many complex features there is some nuance to getting the most out of it. You may even find that there are valuable features in a Power BI Premium license.
Balancing Incremental Refresh with Periodic Full Refreshes
One of the challenges that organizations often face is managing refresh schedules comes when data is backdated, or arrives late. When working with Incremental Refresh, these changes may not be picked up and may require you to periodically fully refresh the dataset to make sure that data is up to date and accurate.
Why Periodic Full Refreshes Are Important
There are a few reasons that you should still consider fully refreshing data. Some of them include:
- Historical Data Adjustments – Inserting or updating records in past periods can escape the Incremental Refresh criteria. A full refresh ensures that any backdated data is captured.
- Data Integrity – Full refreshes can act as a way to validate and ensure completeness of your data.
- Systematic Error Correction – On occasion computers mess up, connections drop, or whatever kind of voodoo that causes IT problems. A full refresh will mitigate these from becoming lasting issues in your dataset.
If you decide to implement a full refresh schedule, we recommend doing it on a regular basis that aligns with business needs. For example, accounting data could best be updated on a quarterly or annual basis after you’re sure that all periods are locked.
Note that as datasets grow in size, it can take a long time for full refreshes to occur. It may make sense to schedule them in off hours or manually set one off before you leave for the day.
Data Size Limits in Power BI Instances
Power BI developers should be aware that there are data size limitations to the Power BI Service and they vary based on which tier of licensing you have and which settings you have enabled. There are both dataset size limits along with workspace size limits. You can get around some individual data set size limits by splitting it into smaller datasets, or it may be beneficial to sign up for a Premium Per User license that supports much larger datasets.
To learn more, please reference Microsoft Licensing: Pricing & Product Comparison | Microsoft Power BI
and be aware that there are additional settings to working with files bigger than 10gb: Large datasets in Power BI Premium – Power BI | Microsoft Learn
If dataset sizes are consistently exceeding these, it’s recommended to look into a more robust large data storage solution, such as Microsoft One Lake.
Incremental refresh can be a huge time saver, and sometimes more importantly a money saver. Cloud data providers are notorious for sticking companies with exceptionally large bills when they don’t fully understand the cost structure of moving data around and paying for on demand compute power. These costs can partially be controlled by limiting the amount of data being transferred at any given point making incremental refresh a highly valuable part of your Power BI toolbox and should be considered a best practice for any large enterprise deployment.