A few weeks ago, I posted about using Excel’s REDUCE and SUBSTITUTE functions to perform multiple find and replace actions to create dynamic text. But what if we want multiple find and replace during the data-cleaning process? In that scenario, we need to use Power Query to replace values based on a list. So, in this post, let’s find out how to do that.
Table of Contents
Download the example file: Join the free Insiders Program and gain access to the example file used for this post.
File name: 0156 Replace values based on a list.xlsx
Watch the video
Example data
The scenario we are working with is as follows:
- Each division of your company has submitted a list of sales by customer
- You have combined the data into a single table for analysis
- However, each division is not consistent in their customer naming
- You need to find and replace the customer names to get a clean data set
Here is the data we are working with. This has been loaded into Power Query in a query called Data.
In this data set, the following values are the same customer:
- Magic Marvin | Marvin Hilton
- Sarah Toote | Sarah Toote (t/a Epic Designs)
- Whittlesborough Lodge Limited | Whittlesborough Lodge Ltd
- Wicks Limited | Wick’s Limited | Wicks Ltd
Therefore, this is the Table of values to find and replace; it has been loaded into Power Query as a query called FindReplace.
Perform a single replace values transformation
Let’s start by getting the syntax to replace a single value.
In the Data query, select the Name column, then click Transform > Replace Values
Enter any values in the Replace Values dialog box; we don’t care; we just want the syntax. I’ve used Find This and Replace With. Then click OK.
The formula bar at the top displays the M code created.
= Table.ReplaceValue(#"Changed Type","Find This",
"Replace With",Replacer.ReplaceText,{"Name"})
The key parts are:
- #”Changed Type”: The name of the table to perform the transformation on
- “Find This”: The text to find
- “Replace With”: The text to replace
- {“Name”}: The list of columns to perform the replace values action on
Copy this formula from the formula bar; we will need it later.
Then delete the step.
Using List.Accumulate to find and replace values based on a list
OK, now comes the tricky part.
We are going to use the List.Accumulate function in Power Query. This does not appear anywhere in the User Interface, so we have to write the code by hand.
List.Accumulate loops through a list and performs a transformation for each item in that list.
To learn more about List.Accumulate, check out this post: https://gorilla.bi/power-query/list-accumulate/
Click on the fx icon next to the formula bar and enter the following:
= List.Accumulate(
{0..List.Count(FindReplace[Find]) - 1},
#"Changed Type",
(state, current) => state
)
The key parts are:
- List.Accumulate: The name of the function which perform the looping activity.
- {0..List.Count(FindReplace[Find]) – 1}: Creates a list of numbers from 0 to the count of items in the Find column of the FindReplace query. Because Power Query is zero-based, we need to minus 1. This creates the list to loop through.
- #”Changed Type”: The start value for the initial state. In our example, this is the name of the previous step.
- (state, current) => state: The function to perform for each item in the list.
- state: the intermediate result after the transformation on each item
- current: the value currently used inside the loop.
At present, this function achieves nothing; there is nothing happening inside the loop; it’s simply returning the existing state with each loop.
Now it’s time to use the code we copied earlier and add it into the loop (see the bold section below)
= List.Accumulate(
{0..List.Count(FindReplace[Find]) - 1},
#"Changed Type",
(state, current) => Table.ReplaceValue(#"Changed Type",
"Find This","Replace With",
Replacer.ReplaceText,{"Name"})
)
Before this works, we need to make a few changes:
- Change #”Changed Type” to state
This ensures we use the state after the previous loop - Change “Find This” to FindReplace[Find]{current}
This returns the current value from the Find column of the FindReplace query - Change “Replace With” to FindReplace[Replace]{current}
This returns the current value from the Replace column of the FindReplace query
The query now looks like this
= List.Accumulate(
{0..List.Count(FindReplace[Find]) - 1},
#"Changed Type",
(state, current) => Table.ReplaceValue(state,
FindReplace[Find]{current},FindReplace[Replace]{current},
Replacer.ReplaceText,{"Name"})
)
That’s it. Take a look at the Preview Window. All the changes have been made.
BOOM! Now we can analyze this data.
Conclusion
Replace values in Power Query is applied one value at a time. Therefore, it is difficult to use if there are lots of values, or an unknown number of values. However, finding and replacing values based on a list is possible by leveraging the power of List.Accumulate.
Related posts:
- How to Find & Replace multiple words in Excel: REDUCE & SUBSTITUTE
- How to filter by a list in Power Query (4 methods)
- How to transform nested tables in Power Query (without coding)
Discover how you can automate your work with our Excel courses and tools.
Excel Academy
The complete program for saving time by automating Excel.
Excel Automation Secrets
Discover the 7-step framework for automating Excel.
Office Scripts: Automate Excel Everywhere
Start using Office Scripts and Power Automate to automate Excel in new ways.
Mark – nice one. Works very effectively on Case-senstive replacements too.
A suggestion: I would probably swap around the first argument of List.Accumulate and make a direct reference to the translation table.
You could either use
– Table.ToRows
– Table.ToRecords (https://powerquery.how/table-torecords/ )
I’d probably go for the Table.ToRecords one. Simply because it allows you to reference the field name of the items to swap.
Cheers,
Rick
This is pretty cool. Thanks Mark. I am starting to realize that the possibilities with PQ are limited mostly just by the confines of my creativity.
I noticed the suggestion by Rick and decided to give it a shot on my dataset. Works.
let
Source = Excel.CurrentWorkbook(){[Name=”tblData”]}[Content],
tblToListOfRecords= Table.ToRecords(tblFindReplace), //convert table to a LIST of records
tblScrubbed = List.Accumulate( tblToListOfRecords,
Source,
(state, current) => Table.ReplaceValue(Source,
current[Wrong], //old value. access field by field name
current[Right], //replace value. access field by field name
Replacer.ReplaceText,{“Companies”})
)
in
tblScrubbed
Rick’s method is pretty awesome. He certainly knows his stuff.