There are times when you have to import multiple files of the same format into Power Query. The most common scenario is consolidating Excel files saved in a folder, but each file needs to specific transformations applied prior to importing them. Helper Queries allow you to apply the same data transformation steps to individual files then automatically consolidate them.
We’ll break down how to use Helper Queries and the Transform Sample File option to prep individual files before consolidating them into a larger dataset. Even though the feature isn’t used very often, it’s incredibly useful when you need it and should be in every Power BI Developer’s toolbox.
There are some additional tips and tricks that you’ll want to be aware of to avoid common pitfalls.
Let’s jump in and use an example of importing multiple Excel Files from a Folder using Helper Queries!
Table of Contents
How to Import a Folder of Excel Files into Power BI
To connect Power BI to a folder full of Excel Files, navigate to Home, Get Data, All, Folder then select the folder where your files are stored. If all of your files have the same headers on the top row, Power BI can automatically consolidate them into a single data source. If each file requires some data transformations to get them into the same format, you will want to consider using Helper Queries and Sample File Transformations.
After selecting a folder to import, you can click on Combine to automatically import the files and consolidate them into a single table to work with. This method will only work if the files are in the same format and Power BI can determine how to line up the different columns to append them together.
If you need to make changes to your data prior to importing it, click on Transform.
What are Power Query Helper Queries?
Power Query helper queries let you pre-process individual files prior to combining them. They allow you to apply data transformations, and prep your data one file at a time. When the helper queries are done they feed the transformed data into the primary query giving you a consolidated dataset to work with.
You’ve likely seen them but not thought much about it. The following screenshot is a Helper Query shown in the Power Query view of Power BI.
Let’s look at an example of how to use them.
How to Use Helper Queries in Power Query
To use helper queries in Power Query, select a Sample File to transform under the Helper Queries section, and apply data transformation steps like you would to a full dataset. All of the commands are the same, with the primary difference being that the transformations are applied to each individual file at the same time. Switch back and forth between the sample query and the primary query to ensure that transformations are being applied appropriately.
Because queries are being applied to multiple files that could have different formats, there is some nuance to using this feature.
Let’s look at an example using the following two Excel spreadsheets. These are the same general format, but have different column headers, and the headers start on row 5. If the headers were on the first row and the same text it would not be necessary to transform each one individually.
Start the process, in Power BI by importing from a folder Home, Get Data, All, Folder, select the folder, choose your files and click on Transform Data.
You’ll see a table of files listed, go to Content and Expand it by clicking on the Two Down Arrows in the top right corner of the Content column.
The next screen prompts you to Select Template File, as long as the data is consistent it doesn’t make much of a difference as to which file you select. This is setting up a template that Power Query will use as a starting point to transform each file.
From the Transform Sample File Section, begin applying data transformation steps.
In this example, we applied the following steps:
- Delete Promoted Headers
- Remove Top 3 Rows
- Use First Row as Headers
As we’re working on it, you’ll probably see that there is an error in the primary query.
To find out what the error is, click on the primary query, then scroll down to where the two datasets change. You’ll see the name of the first file being imported correctly, then Errors where the second file would normally be appended into the dataset.
The error occurs because one of the data transformation steps wasn’t able to be applied to each file.
When this error occurs, the most common cause is a difference in column names between datasets. It’s one of the trickiest errors to workaround in Power Query because so many of the data transformation steps are based on referencing specific column names.
Helper Queries Must Have the Same Column References
Power Query applies the same applied steps from a Transform Sample File to each file. If your columns have different names between files many of the data transformations will cause errors when the columns aren’t found in a subsequent dataset that differs from the Sample File Template.
For example, here are the transform sample file steps as soon from the Power Query Advanced Editor where you can view the Power Query M code:
As you can see in the 5th row, the Changed Type step references Apples and Oranges columns in addition to others that are not in our second vegetables file. The mis-match causes an error when Power Query is unable to find the columns.
The first thing to check for an error is deleting Change Data Type applied steps. They can be applied later at the primary query level. Even if you’re not intending it to happen, things like Promoting Headers will automatically create a Changed Type step.
After fixing the data type mis-match, we can look at our primary query and see that our datasets are consolidated into a single one.
Working with Datasets with Different Column Headers in Power Query
Power Query typically has multiple ways of getting to the same result, but there can be some nuance to the exact way that you get there. When working with datasets that have different column names you have to be careful to make sure that the specific transformations you are using reference as few column names as possible.
For example, the following excerpt from the Power Query Advanced Editor shows two ways to pivot data. The first selects several columns and uses the Unpivot Only Selected Columns Option.
#"Unpivoted Only Selected Columns" = Table.Unpivot(#"Promoted Headers", {"Apples", "Oranges", "Bananas"}, "Attribute", "Value")
This second example selects a date column and uses the Unpivot Other Columns option.
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Promoted Headers", {"Date"}, "Attribute", "Value")
These two queries return the same results, but notice how the specific field names are not defined in the second transformation other than the Date column which is common between both datasets.
Another common issue occurs when removing specific column names vs. using the removing other column feature. One references a column name that may not appear in all datasets while the other only references columns that should exist across files.
It may take some experimenting with different options in Power Query to avoid data transformation errors across multiple files, but there is typically a way to do it as long as the data you are working with is consistent across files.
Conclusion
Whether you’re working in Power BI or Excel, the Power Query Transform Sample File feature is an incredibly useful tool. It’s indispensable when you have to prep similar datasets prior to consolidating them into a larger table that can be imported into a report.
Apply data transformation steps to Transform Sample File. The transformations that occur to the template are applied to the primary query in the “Other Queries” Section. It’s a good practice to periodically check the primary query to make sure that you don’t get a data error and the applied steps are being applied appropriately to all of the different files.
Once you finish updating the helper query you can click on Close and Apply in Power Query, and the data will be loaded. After loading, you will work with a single dataset of the consolidated table. From that point on, you won’t even be aware that helper queries are behind the scenes setting up your reporting data.