Looking up data from a table or worksheet is probably the most common activity undertaken by Excel users to create reports. Learning to use VLOOKUP for many is their first taste of Excel’s power. But what about when using Power Query? There isn’t a VLOOKUP function, so how can we lookup data from another table?
Effectively there are three types of lookup, an exact match, an approximate match, and a fuzzy match.
- Exact matches are the most common and require the lookup value to be identical.
- Approximate matches find the value above (or below) the lookup value.
- Fuzzy matches find values based on how similar they are to other values using some form of algorithm.
Power Query can do all of these types, though in this post, we will focus primarily on the first two. To perform a lookup in Power Query, we will be using the merge transformation, which is on the Home ribbon.
The examples in this post use the file called Example 14 – Lookup Data.xlsx
In the file there are three tables:
- Sales – Containing monthly sales data
- Customers – Containing customer contact information
- Rebates – The discount given to customers for buying over specific values.
To illustrate how we can lookup data in Power Query we will be creating two reports.
Total Sales by Town (lookup with an exact match)
The Sales values and the Town of each customer are in separate tables; therefore, we need to use the Merge function to perform the lookup.
Total Rebate by Customer (lookup with an approximate match)
To get the rebate for each Customer we need to perform an approximate match. We don’t necessarily want to match the exact value, but to find the band in which the sales value falls. We will still use the Merge function, but we will need a few more transformations to get the correct end result.
Load the data into Power Query
Start by loading the three tables into Power Query.
Click on any cell in the Sales table, then select Data -> From Table / Range from the Excel menu.
The Power Query Editor will load. I have tried to set-up the tables so they require minimal transformations, but in the real world it is rarely this simple.
For our example, we do not need to load the data into the worksheet, instead we can just create a connection. From the Power Query ribbon click Home -> Close & Load To…
Select Only Create Connection from the Import Data window, then click OK.
Repeat this action with the Customers and Rebates tables.
In Excel, open the Queries & Connections pane (Click Data -> Queries & Connections if it is not visible), the three queries should be listed.
Now we are ready to start using the Merge feature 🙂
Lookup with an exact match
To illustrate an exact match, we will create a report of total sales by town.
Let’s get back into the Power Query editor by double-clicking on the Sales query within the Queries and Connections pane.
In the Power Query editor select Home -> Merge Queries (drop-down).
There are two options in here, Merge Queries and Merge Queries as New. The difference between them is whether the Merge will create a new query, or be added as a transformation step within an existing query. For ease, we will use a new query. Select the Merge Queries as New option.
The Merge window will open.
A lot is going on here:
- Select the first query to be used – for our example, it is the Sales query.
- Select the second query to be used – in our example, it is the Customers query.
- Select the column from the two tables which should be matched.
- The Join Kind provides six different types of merges – Select Left Outer if working along with the example.
- Click OK.
There are 6 join types, look at the section below to find out what the different types do.
A new query is created. The first query selected in the Merge window is displayed, with an additional column containing the table of the second query.
Click on the expand table icon in the header of the Customers column. We only need the Customer and Town columns, so select those, then click OK.
Power Query will merge the queries, by looking up from the first table into the second table.
The selected join included all items from the Sales table and the matching items from the Customers table. Any items without a match will display null, as shown in the screenshot below. The Customer Mega Mart exists in the Sales query, but not the Customers query, therefore the null value is displayed.
To complete our example, we’re going to use the merged column to create a summary report. Select the Town column, then click Transform -> Group By from the menu.
In the screenshot above, I have chosen to Sum the Value column, into a column called Total Sales. After I click OK, it will create a summary report of sales by town (see the screenshot below)
Types of joins
Before we look at another example, let’s take a few minutes to think about the six different types of joins.
Joins enable us to compare lists, then return values accordingly. Thankfully, the descriptions provided for each join is a good summary of what it does.
Outer joins return all the rows from one or both lists. We can select either Left, Right or Full, depending on which list should return all its rows.
- Left Outer – all items from the first list are returned, along with the matching items from the second list
- Right Outer – all items from the second list are returned, along with the matching items from the first list
- Full Outer – all items from both lists
An Inner join returns only the values which exist on both lists. If the first list or second list has items which are not on the other list, these are excluded from the end result.
Anti joins return the items which do not match any values on the other list
- Left Anti – returns any items in the first list which do not match the second list
- Right Anti – any items in the second list which do not match the first list
This is all powerful stuff and proof that Power Query can achieve more than VLOOKUP.
Lookup with an approximate match
Now it’s time to look at an approximate match. Before we start, if you’re not sure what this means, please read my post about Approximate Match here.
In this example, we are calculating the value of a rebate due to a customer based on the sales value. Customers with sales greater than:
- $500 receive a 2% rebate
- $750 receive a 5% rebate
- $1,000 receive a 10% rebate
These thresholds are shown in the table on the worksheet.
First we need to work out the total sales by customer first, then do a more complicated merge to calculate the rebate value based on the total sales.
Add another version of the Sales table into Power Query by following the same steps above.
In the Power Query editor click Transform -> Group By.
The Group By window will open, make the following selections:
- Group by: Customer
- New column name: Total Sales
- Operation: Sum
- Column: Value
Click OK. The Preview Window should display the following table, with the total sales by customer.
In the last example, we created the Merge as a new query; this time, we can add the Merge as another step. Click click Home -> Merge Queries.
In the Merge window we want to use a Full Outer join between the Total Sales column of the Sales table and the Rebate Band of the Rebates table.
Click OK to close and return to the Preview Window.
Expand the Rebates column, include both columns, then click OK.
The Preview Window now looks like this:
We’re now going to write an if statement. We could use the Conditional Column feature, but I find it easier to write it as a formula. Click Add Column -> Custom Column.
The Custom Column window opens. Give the column a new name and enter the following text in the formula box:
if [Total Sales] = null then [Rebate Band] else [Total Sales]
We will cover if statements in more detail in a future post. For now, just think of it as the standard Excel IF function with the following changes:
- remove the brackets
- replace the first comma with the word then
- replace the last comma with the word else
Sort the new column by selecting the column header then click Home -> A-Z.
Next, select the Rebate Value column and click Transform -> Fill (dropdown) -> Down.
The Preview Window should look like the following:
Now just a few steps remaining to tidy up the table:
- Filter out the null values from the Customer column
- Remove all columns except the Customer, Total Sales, and Rebate Value.
Now let’s finish up by calculating the rebate value. Select the Total Sales and Rebate Value columns, click Add Column -> Standard -> Multiply.
The query now displays the final result.
What happens if there are multiple items which could match? In this scenario, a VLOOKUP exact match in Excel returns the first item it finds in the list. Merge behaves differently.
Merge will return each instance of a matched item. For example, let’s assume we have two tables one with Product information and another with Stock data about those products.
If we were to Merge only on the Size column, the M sized items would be duplicated as there are two M’s in the table being looked up.
If, for whatever reason, you only wanted to match one item, remove the duplicates from one of the tables before performing the Merge.
Multiple Criteria Lookup
The great news is that Power Query does not restrict you to one list. Let’s say that you want to match three columns? That’s no problem.
The order in which you select the columns determines which columns are matched. Look at the screenshot below. The columns selected in the first table were Color, Size, then Location (in that order). The numbers in the column header identify the order in which the items were selected. The columns selected in the second table were in the same order, Color, Size and Location.
How easy is that!
Fuzzy Match Lookup
When looking at the Merge window in our examples, did you notice the option to “Use fuzzy matching to perform the merge”? This is a reasonably new option, it matches values which are similar. For example, it will match “Power Query” with “power-query”. You can’t do that with VLOOKUP! We can even change the thresholds of how similar values must be before they match.
As this is a new feature I won’t go into detail here, but you can find out more information in these posts:
- Excel University – Fuzzy match with Power Query
- Microsoft – Fuzzy match support for Get & Transform (Power Query)
- Import Data
- Data Refresh
- Edit Queries
- Close & Load Options
- Using Parameters
- Basic Transformations
- Combine / Append Queries
- Import All Files in a Folder
- List All the Files in a Folder and File Attributes
- Import Data from the Current Workbook
- Import Data from the Web
- Unpivot Data
- Unstacking Data in a Column
- Lookup Values using Merge
- Change the Source Data Location
- If Statements for Conditional Logic