We’ve seen how to import external data from a single file, how to import all the files in a folder and how to import data from a Table / named range within the same workbook. But what if we want to import ALL the data within the same workbook? Well, that is where we are headed in this post.
If there is one Excel best practice which users ignore all the time it’s this – keep data of the same type on one tab. As you’ll see in our example files, the January, February and March are the same type of data in the same structure, so they should really be in one table on one tab. Most beginner Excel users don’t think like this, so there is a good chance you’ll come across these types of workbooks and will need to use this technique at some point.
Download the example file
The examples in this post use the Example 11 – Import from Current Workbook.xlsx file.
Import Tables from the current workbook
Open the Example 11 – Import from Current Workbook.xlsx file. Next, we are going to create with a blank query by clicking Data -> Get Data -> From Other Sources -> Blank Query
The Power Query Editor will open. There is one step in the Applied Steps window, nothing in the Preview window and most of the transformations are greyed-out. While the Applied Steps window shows Source as the step, there is actually nothing in this step at present. This truly is a blank query.
We are going to write some M code to give Power Query the source.
If using the Formula Bar, we could type the following (Click View -> Formula Bar if the formula bar is not visible).
Or if using the Advanced Editor (Home -> Advanced Editor) we could have the following:
let Source = Excel.CurrentWorkbook() in Source
Remember, M code is case sensitive, so you’ll need to type the text exactly as it is shown above.
The Preview Window will display the Tables in the worksheet.
Click the Expand icon to drill into the workbook structure. Uncheck the Use original column name as prefix, then click OK.
The Preview Window now displays the combined data.
Complete the query with the following transformations:
- Remove the Name column
- Change the data type for each column
- Give the query an appropriate name (I have chosen CombinedTable).
Click Close & Load to push the data into a new worksheet. You don’t know it yet, but you’ve got a problem, I’ll show you.
The Queries & Connections pane shows 151 rows loaded.
Make some changes to the January, February or March tables and click Data -> Refresh All.
Err… what just happened. We’ve now got 301 rows, but we’ve not added any more rows.
If we refresh the data again, we’ll have 451 rows.
Let’s go back into Power Query and see what’s going wrong. In the Queries & Connection pane double-click the query to open the Power Query Editor.
Click on the Source step in the Applied Steps box, then click Home -> Refresh Preview. The preview window now shows this:
Hopefully you see the problem now. The query we created loads the data into Excel as a Table, therefore it is now included as a source Table each time the query is refreshed and is combined with the other tables, before being loaded back into Excel again. Each time we click Refresh, the Table gets longer and longer and longer. It’s a bit like the Excel version of the movie Inception.
Let’s fix this problem right now. Add a step after the Source step to filter out the combined query.
Will this change filter to remove the CombinedTables or will it filter to include tblJanuary, tblFebruary and tblMarch? This is important, as we want the query to expand to include new tables as they are added to the workbook.
Take a look at the Formula Bar; the M code will be:
= Table.SelectRows(Source, each ([Name] <> "CombinedTables"))
The good news is that this code will remove the CombinedTables table, but allow any other Tables added to the workbook to will be automatically included into the scope of the query.
Having a standard naming convention of tables and queries is useful. For example, you may decide that all source Tables must have tbl at the start, this allows us to filter only to include tables which start with tbl. By doing this, it no longer matters how many tables or queries we have, the looping effect can be controlled.
Tables or Ranges
Power Query will happily import Tables and Named Ranges. If they both exist in the worksheet then both are listed.
Print areas are a special type of named range, so are also listed. Correct naming conventions and filtering is required to ensure the query uses the source we want it to.