In the previous part of this Power Query series, we took a first look at how to import data from various file formats and load it into Excel. Now, in this part, we move on to consider how we can refresh the data, which enables us to build a query once and use it over and over.
Download the example files
It may be tempting to just read the text below, but you will retain the information much better if you work along with the examples.
Subscribers can download the files from the Downloads section.
Refresh all data
Unlike Excel’s calculation engine, which by default will re-calculate with every change, Power Query will only recalculate when specifically commanded.
The refresh process is straightforward, click Data -> Refresh All
Let’s see this refresh process in action.
The following uses Example 6 – Data Refresh 1.csv and Example 6 – Data Refresh 2.csv from the downloads.
Create a basic query
Open a new workbook and create a new query using a CSV file; from the Data ribbon, Click Get Data -> From File -> From Text/CSV.
The Import Data window will open. Navigate to the Example 6 – Data Refresh 1.csv file, select it and click Import.
Power Query will open a new window and display a sample of the data. Click Transform Data.
We will now make some basic data transformations to shape the CSV into more useful information. This is the first time in this series we are performing some of these transformations, but don’t worry too much about them now, we will definitely see them again in future sections.
Click on the Date column. From the ribbon, click Transform -> Date -> Month -> End of Month (this will change the Date column to the last day in the calendar month).
Click on the Product column header, then hold the Shift key and click on the Sold By column header. Both columns should now be selected.
From the ribbon click Home -> Remove Columns (this will remove the selected columns).
Now let’s pivot the data on the Date column. Select the Date column header, then click Transform -> Pivot Column on the ribbon.
The Pivot Column window will open. Change the Values Column to Value. Click the Advanced options and set the Aggregate Value Function to Sum. Then, click OK.
Now click Home -> Close & Load. Excel will create a new worksheet with a Table. This table contains the data from the CSV, which has been transformed. It should look like the screenshot below.
Update the source data
To demonstrate the refresh process, we are simulating where a user might receive a new file on a daily, weekly or monthly basis. To demonstrate this, we need to rename both of the example workbooks.
- Example 6 – Data Refresh 1.csv can be renamed to anything you want
- Example 6 – Data Refresh 2.csv must be renamed to Example 6 – Data Refresh 1.csv
Power Query is still pointing to a file called Example 6 – Data Refresh 1.csv, which contains the new data.
Refresh the data
Click Data -> Refresh All.
In the background, Excel will now import the data from the file into Power Query, apply the same transformations and Load the data into the worksheet. The new data should appear automatically on the worksheet.
Did you notice that! The March data appeared with a single click – that is amazing!
Each time there is a change to the existing data, or where a new file is received, it is only necessary to save the file with the same file path, then click one button. That is powerful stuff… right?
In a future part, we will look at linking the file path a cell, so you can import without having to overwrite the previous file.
Refresh only specific queries
When we have lots of queries within a workbook, it can take a while to refresh them all. In these circumstances, it is useful to refresh only the queries we need. There are a few ways to achieve this.
If you click on a Table, which has come from a query, the Data -> Refresh drop-down menu includes an additional option.
Clicking on Refresh will only refresh the selected query.
Queries & connections menus
When we loaded the data from Power Query into Excel, the Queries and Connections menu opened. This window shows all the queries created in the workbook.
If that menu is not open, click Data -> Queries and Connections
Each query in the Queries and Connection menu has a refresh icon. Simply click the icon to refresh the data.
Alternatively, we can right-click on the query and select Refresh from the menu.
I’m not going to cover it in this series, but you can use VBA to refresh individual queries. This can be a useful option when:
- you want to provide an easy to use interface for the user
- you want to control the order in which the queries refresh
Advanced refresh options
As more queries are added to a workbook, it can soon become time-consuming to refresh individual queries, or too slow to refresh them all. The good news is that Excel already has options to enable the data to be refreshed automatically.
Select a cell within a query Table, then click Data -> Refresh All (Drop Down) -> Connection Properties
The Query Properties window will open
The refresh options available in this window are:
Background refresh – Background refresh allows us to keep working while the data refreshes in the background. It is turned on by default. If we remove this option, we will be unable to use Excel until the refresh process is complete. With background refresh applied, the refresh process itself will take longer, but that might not be a problem as we can keep working.
Refresh every x minutes – The workbook must be open for the “Refresh every x minutes” option to work. It can be useful where the source data is regularly changing. It is probably a good idea only to use this option with the background refresh enabled, or it could become very annoying for users.
Refresh when opening the file – Automatically refreshing the data when opening the file is a useful feature as you know the data is up-to-date whenever you open it.
Refresh this connection on Refresh All – Where a query contains static data, there is no need to refresh it every time. By removing the tick next to “Refresh this connection on Refresh All”, it removes these queries from the Refresh All process, which in turn, reduces the refresh time.
The watch outs!
OK, it’s now time to consider the things which might catch us out.
Closed vs. Open files
The location where the source data is stored is important for data refresh.
When the data is stored in an external file, be it CSV, text or even an Excel Workbook, it is the last saved version of the file which is loaded into Power Query. If the external file is open, the changes will not be included in the refresh until it has been saved.
Where the data and query are in the same workbook, all changes are incorporated even if the file has not yet been saved.
Queries based on other queries
In a future post, we will consider how to use a query within another query. There is nothing wrong with this practice, but we just need to be aware of the consequences if we only refresh a single query, as any subsequent queries in the chain will not be refreshed.
By using Refresh All, the queries are refreshed in the right order. This is not the case when refreshing individual queries, it is up to the user to refresh the right queries at the right time and in the right order.
Background refresh is a great feature as we can keep working in Excel. However, any Pivot Tables or formulas which are updated or calculated before the background refresh has completed will not necessarily have the correct data. Therefore is some circumstances, you may wish to remove background refresh.