Use the Power BI Merge function to combine two tables, placing one on the left and one on the right based on a common criterion. It’s the same as a SQL Join and similar to using an Excel VLOOKUP to bring in additional columns from multiple datasets. Merging can combine two datasets in Power BI and is an essential function of data preparation in Power BI.
Merging tables together in Power BI is an important skill to master. We’ll look at both how to merge or join tables together along with the use cases where the technique should be deployed.
Let’s dive in!
Table of Contents
When Should Merge Queries in Power BI?
Datasets should be merged in Power BI when you need a consolidated dataset that combines information into a single table. Some common use cases include:
- Merge Vendor Numbers with Vendor Addresses on separate tables.
- Merge expected Sales Demand with Current Inventory Levels
- Merge Customer Details with Customer Purchase History
In all 3 of these use cases there would typically be a commonality between them. In the examples above the common fields would be something like:
- Vendor Number
- SKU or Inventory Control Number
- Customer Number
Joining tables together in Power BI based on a common value in a column is by far the most common use case. You can think of it like using an Excel VLOOKUP to pull in additional columns from a separate tab to create one table with all of the columns you need in your reporting dashboard.
However, you can also merge tables together to find mis-matched datasets or use it as a way to filter on multiple criteria.
What is a Star Schemas in Power BI?
It’s good to have a working knowledge of Star Schemas when talking about merging and joining data in Power BI. A common practice in data modeling is to split data into multiple smaller tables rather than one gigantic table with all of the columns you could ever possibly need. This makes it easier to find the data that you want, and it helps reports run faster because it’s only calculating with the data that you need instead of using all possible data that’s available.
Another way to look at it, is that there is a primary table. Then related information that you might want is split out into separate tables.
The example below shows Cookie Sales as the primary table (Fact Table). It has 4 columns to track Cookie Sales.
There are two additional tables called Cookie Details and Customer Details (Dimension Tables) that have a single column in common with the sales data.
These datasets are related by a common field, and once merged together you can build one bigger table as needed.
An added benefit is that if a customer updates their address, you only have to update it in on place, on the customer details table. Otherwise, you would have to find and replace all in the sales table which could be thousands or millions of rows.
The example above is a simple one with only three tables, but the concept is very common when pulling data out of accounting or ERP systems.
The name Star Schema comes from the way that a dataset looks when you have many different complementary tables to the primary one. Connections go out from the primary table to the secondary tables creating a star shape.
Most systems will have a large number of tables, but only a small subset of them are needed in most reports.
We can combine these tables into a consolidated one using the Power BI Merge function.
How to Merge or Join Two Tables in Power BI
To Merge or Join two datasets in Power BI, First Launch the Power Query Editor. Navigate the Merge Queries button on the Home tab of the Power Query Ribbon. Select Merge Queries or Merge Queries as New. When the Merge options screen appears, select the second table you want to merge and then click on each column you want to use as matching criteria.
In the following example we will merge a table of Cookie Sales data with a Customer Address table. The commonality between the two tables is the Customer Number column.
Each table has been imported in Power BI, and we launched the Power Query editor.
From Power Query, click on the Merge Queries button on the Home tab of the Power Query Editor.
Merge Queries provides 2 options.
- Merge Queries – This will bring columns from the second table into the first one. Modifying the first table.
- Merge Queries as New – This option leaves both tables alone, and creates a new third table with the combined columns.
After selecting a Merge method, the Merge options screen appears.
1.) Select the table that you want to merge into the first column, using the dropdown in the middle of the page.
2.) After selecting a second table, a list of columns appears.
3.) Click on a column in each datasets. The selected columns are used to match up the values on a row by row basis.
Note: Hold down shift to select multiple columns. Power BI will only match rows if all criteria are matched between the selected columns.
After setting up the Merge, click OK and Power BI willjJoin the two tables together.
After pressing OK, you will see your original dataset, and to the right there is a new column called Customer Address.
Click on the <-|-> arrow button at the top of the new column. It will show you a drop down to select columns that you want to expand.
After selecting the additional columns, they will be merged with your Power BI table.
After the datasets are combined you can exit the Power Query editor and work with the data like it’s a single continuous table.
Different Techniques for Merging Queries in Power BI
Power BI has 6 different ways to merge or join data together. While not as robust as writing SQL queries, the concepts are the same. Sometimes when combining tables with a merge, there are more rows on one table than the other. Different merge types handle non-matching rows differently. The following table breaks down the different Merge Types and when to use each one.
|Inner Join||Returns only the rows that have matching values in both tables.|
|Left Outer||Returns all records from the left table, and the matched records from the right table.|
|Right Outer||Returns all records from the right table, and the matched records from the left table.|
|Full Outer||Returns all records when there is a match in either table.|
|Left Anti||Returns the rows from the left table that do not have matching values in the right table.|
|Right Anti||Returns the rows from the right table that do not have matching values in the left table.|
As you can see, Power BI gives you a lot of options when matching up tables. The biggest decision to make is to determine how you want Power BI to react when there is extra data one table vs the other.
While slightly different terminology, W3 Schools explains the different join types used in SQL with some great graphics to highlight the differences.
Using Fuzzy Match to Merge in Power BI
The standard merge settings will look for an exact match between two tables before combining them. However, Power BI also has the ability to Fuzzy Match. It makes its best estimate to find the next closest value and combines based on it.
Because of the high risk of incorrect matches, we recommend using this feature sparingly and only in datasets that are small enough where you can manually review the records being combined to determine whether or not the merge is joining records appropriately.
How Fuzzy Matching Works to Merge in Power BI
Fuzzy matching in Power BI utilizes a set of algorithms to determine the “distance” between different strings. A low distance indicates a strong match, even if the strings are not identical to each other.
Several parameters allow you to customize the fuzzy matching process:
- Threshold: A value between 0 and 1 that sets the minimum similarity score for a match. A value of 1 would mean an exact match is required, while lower values make the matching more lenient.
- Maximum number of matches: Specifies the maximum number of matches that can be returned for each value in the column from the primary table.
- Ignore Case: Allows you to specify whether the fuzzy matching should be case-sensitive or not.
- Ignore Spaces: If enabled, spaces will be disregarded when evaluating similarity between strings.
A common use case for Fuzzy Matching is to combine two tables of addresses or names that commonly have different spellings and punctuations. Try merging the two lists together to identify possible matches. It can make for a starting point to identify potential matches without having to manually review a long list.
Tips and Best Practices Merging in Power BI
There are a number of common issues that arise when merging multiple datasets together in Power BI. The following scenarios are ones to watch out for.
Check Row Count Before and After a Merging Tables in Power BI
Rows can be duplicated during a merge, or they can be left off altogether based on the join type and expected behavior. To identify whether rows have been duplicated or dropped off, click on the bottom left of the Power Query editor and Turn On Column profiling based on entire data set.
The default setting will only sample the first thousand rows. The number of rows and columns is displayed in the bottom left corner of the Power Query editor.
Duplication of Rows after Merging in Power BI
If a value exists multiple times in a table that you are merging with another one, it’s possible that you will end up with duplicate rows in your primary data table. Be careful to make sure that data isn’t being duplicated during a join, it will throw off your totals and other calculations in the dashboard or report.
To fix duplicated rows, you can right click on a column header prior to merging the data and remove duplicates from a dataset.
To avoid duplicates you can also merge on multiple column criteria or concatenate the values to create a unique primary key between the two. The video below explains more of how to utilize this technique to complex merges.
Data falling off Unexpectedly after Merging in Power Bi
Rows can fall out of a Power BI data set after a merge when no matches are found between the two data sets and the behavior will differ based on the merge type. Be careful to have the correct merge type selected if you want to have only matches from the left or right table, or if you want to have a fully consolidated data set after the merge that includes all rows.
Using Anti Merge Types to identify Non-Matching Records
The Anti Merge settings for left and right merges can be used to identify non-matching items. Think of this as a way that you can quickly reconcile two tables based on multiple criteria. You could tell Power BI to Merge based on invoice number and dollar amount columns. For the items that don’t match, you can output it to a new table based on an anti-merge which displays the missing items.
You can also use an inner join to determine which records match, and use it as a way to filter two tables on each other using multiple columns of data.
Fixing Data Type Mis-matches to Avoid Merge Errors
When data types don’t match between the columns that you are using as merge criteria in Power BI, it can result in an error or a blank table when you attempt to expand columns after the merge. If this happens right click on a column name and change the data type prior to the merge.
If the data types match between the two columns Power BI is merging on you should eliminate the error.
Merging Data vs Creating Relationships in Power BI
Merging and creating relationships can both be used to combine tables in Power BI. Merging typically occurs in Power Query Editor and produces a new table that physically contains the combined data. On the other hand, creating relationships is done in the Power BI model and sets up a dynamic link between tables.
Use the merge function when combining data that requires additional data prep or calculations in Power Query. Relationships are better suited for larger datasets and situations where on-the-fly aggregation is needed without duplicating the size of your data set.
Use the Power BI Merge function to combine two datasets when you need one table with columns from different tables. Power BI provides several different merging options that determine what to do with duplicate rows between data sets and what do you when a row value exists in one data set and not the other.
There are a number of considerations when working with merged tables, the biggest one is looking at the row count to ensure that your data doesn’t grow or shrink in size.
As an alternative to merging tables, you can use the Power BI relationship editor to build relationships to different tables. It’s less permanent than updating data in Power Query, but sometimes it’s necessary to merge tables in Power Query so you can perform additional data transformations on a joined table prior to building a dashboard or report on the data.
- How to Append or Union Two or More Tables in Power BI (www.popautomation.com)
- How to Calculate Percentage Change in Power BI
- How to Remove Rows and Promote Headers in Power BI
- How to Find and Remove Duplicates in Power BI
- How to Schedule Power BI Automated Emails and Reports
- How to Add or Remove Blank Rows in Power BI Matrix Visuals
- A Comprehensive Guide to Power BI AI Features