Power Query is one of the best features of Excel to help people automate repetitive data tasks. One of those tasks is combining multiple spreadsheets into a single table. You can import a folder full of different files and tell Excel Power Query to consolidate them automatically, making it easy to have up to date data sets by dragging and dropping new files into the same folder.
The ability to combine files is incredibly common, and one of the best Power Query use cases. We’ll cover how to import a folder full of files into Excel and explain some of the areas that can cause errors, and cover how to avoid or troubleshoot them.
Let’s jump in!
Table of Contents
Using Power Query to Consolidate Files
Power Query supports a wide range of file types, and live data connections. The biggest benefit of using Power Query to consolidate files is that you only have to setup Power Query once, and then you can refresh it like you would a pivot table, letting it bring in fresh data from whatever source it’s connected to.
To import files into Power Query in Excel, go to the Data tab on the Excel Ribbon, and then select Get Data. From here, you can choose your desired file type. This will launch the Power Query Editor allowing you to modify and transform your data as needed.
Power Query supports a wide range of file formats including Excel workbooks (.xlsx, .xlsm), CSV, XML, JSON, and text files that will work with this method. The one thing to be careful of is that it will be much easier to import and consolidate files when they are in the same format, such as having the same column names, and sheet names.
Connecting Power Query to a Folder of Files
To connect Power Query to a folder and import the files within a folder, go to the Excel Ribbon’s Data section, then select Get Data, From File, From Folder. When you select a folder to import, Excel will not initially show you the contents of the folder but press Open. Power Query will pop up and prompt you to import the files in the folder.
Here’s how to import files from a folder using Power Query in more detail.
Step 1.) Get Data From Folder
The first step to use Power Query is to import data into Power Query. You can do this from the Get data button on the Data section of the Excel Ribbon, and then select From File, From Folder.
When you select From Folder, a Windows file explorer dialog will come up. Navigate to the folder you want to open. Be aware that none of the files inside of the folder will show up at this point. This is normal behavior, we want to connect to the folder itself and allow Power Query to detect the files that are contained inside of it.
Pres Open after navigating to the folder that contains the desired files.
After pressing Open from the Windows Explorer, you will see a Power Query dialog box appear that lists the files and file types along with some other meta data about the files.
You’ll have options to Combine, Load or Transform and the option you choose largely depends on whether or not you have to transform your data. For example, if you want to limit the list of files to a subset of files by filtering the list, or if you want to remove headers etc.
For now, we’ll click Combine which will prompt us to choose the Sheet Name in our Excel files that need to be consolidated.
Step 2.) Select the Sheet of the Excel Workbooks to Consolidate
After choosing to combine files, Power Query gives you the ability to choose which Sheet to consolidate from the workbooks.
Power Query uses the concept of a Sample File that acts as a template for consolidating data. Excel needs to have a file that is designated as the first file because it will attempt to line up all other columns of data from other files based on the column headers in the first file.
Step 3.) Transform Imported Data in Power Query
After clicking OK and selecting the Sheet Name to import, you will be brought to the Power Query screen that shows the imported and consolidated data. In our example below, a new column has been created called Source.Name that designates which file each row of data came from.
At this point you can perform additional data transformations, such as right clicking on the source.name column and deleting it from the final dataset.
When finished, click on Close & Load in the top left corner of Power Query and consolidated data will be combined into a single table.
How to Refresh Power Query Data
After data is loaded into Power Query, and you click Close & Load a new table will appear in your Excel workbook. Typically it’s colored green to indicate that it’s tied to Power Query. When you add or remove files to the folder Power Query is connected to, Excel will not update until you refresh the table. Refresh a Power Query table by right clicking on the table and selecting Refresh.
Upon Refresh, Excel goes out into the folder checks for new or removed files and consolidates them. This is a fast and efficient way to add monthly or quarterly data into a folder and apply data transformations on it.
File Schema Requirements
When importing multiple Excel files it’s important to have a consistent file schema. This means that the structure of your files, including column names and data types, should be uniform. Inconsistencies can lead to import errors or incomplete data.
Misnamed columns are one of the most common causes for Power Query imports to not consolidate correctly. Even small changes make a difference.
Take for example, the following column names that would not be consolidated.
- “Sales Amount”
- “Sales Amount “
- Sals Amount”
Because each of these have a different spelling or additional white space after the name, Excel won’t automatically align the columns causing for data to look like it’s missing from Power Query, like the example below.
If data doesn’t come in correctly, check the column names to make sure that they’re all exactly the same. You can even use an Excel formula to check that the headers are exact matches.
Using Helper Queries
Helper queries in Power Query are designed to assist in complex data transformations and manipulations. They can be used to filter data, perform repeated tasks, or manage dynamic parameters. However, one of the most common use cases is prepping individual files that are being consolidated.
In this scenario we’ll consolidate files that are similar to the ones we brought in previously, but with a twist. These files all have two rows of headers. Without removing the headers first, the columns won’t align appropriately. Luckily, we can use Power Query Helper Queries to prep each individual file prior to attempting to consolidate them.
To start, try and import files and if they come in with headers, go to the Transform files section, and go to the Transform Sample.
When you transform the sample data, the transformations will be applied to each individual file prior to loading them. In the example above, we would skip the first two rows, and then promote headers so that each column has a name that matches.
Power Query will execute the transformation steps prior to the consolidation and align the columns. Subsequent transformations on the combined dataset would occur using the Consolidate Files Query in the example above.
Excel users can utilize Power Query to import and consolidate folders full of .xlsx, .csv or other common file formats. It’s easiest to import files that have the same scheme, another name for column headers and column data types. While Power Query is capable of importing and combining files when the columns aren’t named the same, or there are extra columns it won’t be able to automatically align them which can be a pain to address within Power Query.
Because it’s easier to start with clean data, we recommend using files that have consistent formats, or come directly from source systems to ensure consistency. Even when files have headers or other formatting issues, Power Query helper queries can help transform and clean them up allowing you to drop files into a folder and hit refresh in Excel to get up to date data.