This post may contain affiliate links. Please read my disclosure for more info.

Power Query – Import All Files in a Folder

Powre Query - Combine All Files in Folder

In this post, we are looking at using Power Query to import all the files in a folder.  We give Power Query a folder path, click a few buttons and it will import and combine all the files into a single table.  That’s powerful stuff, right!  If we want to add another file into the output table, we only have to save a copy of the file in the folder and click refresh, the new file will be imported too.  This can save you hours and hours.

Before we get started on this technique, there is one point which I need to make you aware of.  All the files to be imported must follow a similar structure and column pattern.  Power Query is magic, but you’ve got to give it a reasonable chance.

Downloads

To work along with the examples below, download the sample files.  Click here, to subscribe and get access to the Downloads section.

Excel Downloads

The examples in this post use the following files:

  • Example 10 – January 2019.csv
  • Example 10 – February 2019.csv
  • Example 10 – March 2019.csv
  • Example 10 – April 2019.csv
  • Example 10 – January 2019.xlsx
  • Example 10 – February 2019.xlsx
  • Example 10 – March 2019.xlsx
  • Example 10 – April 2019.xlsx

Setting up the files for the examples

To work along with the examples in this post, start by moving the January and February files into a separate folder.  These are the files we will import initially.

File set-up

Do not include the March or April files, we will add them in later as part of the examples.

Import all CSV files from a folder

The easiest files to work with are CSV or text files, so that’s an excellent place to start.  What makes them easy to work with?

  • They always contain a single worksheet
  • The first row is treated as the header row
  • Merged cells are not permitted

In Excel, click Data -> Get Data -> From File -> From Folder

Import files from folder

The folder window will open.  Click Browse…, navigate to the folder to be used, then click OK.

Folder Browse OK

A preview of the folder and file attributes is displayed.  If the files are entirely uniform and there are no edits to make, we could just click Combine (drop-down button) -> Combine Close & Load.  However, you’ll notice we have a mix of .csv and .xlsx files in the folder; we can’t combine these easily.  Instead, we will click Transform Data and take a closer look at what’s happening here.

List of files in folder

The Power Query editor opens.  It does not show the data from the file as it has in the past, instead, it shows data about the files.

Our first transformation will be to keep only the files we wish to be combined (the CSV files for our current example).  It is always good practice to filter to only include the files we need; you never know when another user might decide to save a random file within the folder.

Filter the Extension column to include only .csv.

Filter to show only CSV

From here we have two options:

  1. Use a Custom Column
  2. Combine the files

Which method you choose is up to you, and you may even decide to use different methods for different circumstances.

Method 1: Use a Custom Column

The first method uses a Custom Column containing a Power Query function.

Start by selecting the Content and Name columns, then click Home -> Remove Columns -> Remove Other Columns.  This removes the file attribute columns which we do not require.  We have kept the Name column to illustrate that using this method you can retain file attributes through into the final output.

Click Add Column -> Custom Column.

Add Custom Column

In the Custom Column window, input the following information

New column name: CSV

Custom column formula:

=Csv.Document([Content])

Click OK.

Custom Column Window

A new column called CSV is added to the preview window.

You will notice a new icon in the CSV column header.  This is the Expand icon and indicates that the column contains more detail which can be expanded.

Click on the Expand icon, then click OK as we want to keep all the columns of data.

Expand CSV column options.

We have now expanded all the data from both files, and the January & February files are combined.

CSV after Expand

To tidy up the table, do the following:

  • Remove the Content column
  • Add the column headers to be Source File, Date, Customer, Product, Sold By and Value.
  • In the Value column filter to remove the text of “Value”
  • Set the data types for each column

Our preview window should now look like this:

After all CSV's combined

That’s it all done, click Close & Load to load the Table into Excel.

Method 2: Combine the files

Now time for Method 2.  This uses Power Query’s magic built-in combine process.

Assuming we didn’t do Method 1, the Preview window should look like this:

Content Combine button

Did you notice the icon with the two down arrows at the top of the Content column?  That’s the Combine Files icon, it drills from the file level into the content level, and combines each file.  Go ahead and click it… you know you want to.

The Combine File window will open, showing a sample of the data using the first file in the query as an example.  All looks good, so click OK.

Combine Files window

Power Query will now do its magic.  Once it has finished, we can see the files have been combined, that’s all it took, one button!  As shown in the screenshot below, a lot of transformations and steps have been made to achieve this.

Auto query transformations

The preview data should look like this.

Preview window after combine

You can see that both the January and February data has been combined into this single table.

Click Home -> Close & Load to load the data into Excel.

Adding more files to the folder

I have created a Pivot Table to show that the January & February files have been combined.

Data with Pivot Table

Copy the Example 10 – March 2019.csv  file into the folder.

Adding March File

Now for the moment of truth… in Excel, click Data -> Refresh All, the data will be updated.

“If it’s updated, why hadn’t the Pivot Table updated?” – Good question.  The Pivot Table refreshed before the data finished loading from the query, so we have to click Data -> Refresh All again to refresh the Pivot Table.

Data with Pivot Table after update

To turn off the need to double refresh, follow my post on Data Refresh.

File uniformity is key to success

Now add the Example 10 – April 2019.csv file into the folder.  Click Data -> Refresh All

Method 1 will update without any issues.  Method 2 though will lead to an error.

Combine file - Refresh error

Why did this happen?  What went wrong?  Open up the Example 10 – April 2019.csv and take a close look at the first row.  In this file, the column called Customer which exists in the other files has changed to Sold To.

Sold To in CSV file

As part of the automatic steps, Power Query promotes the first row of a CSV file as the column headers.  When we initially combined the files using Method 2, it used February as the sample file, therefore the column headers from February were set as the column headers for the whole query.  Power Query has no way of knowing that April is different from the other files.

Method 1 does not use the column headers from the CSV files,  but uses the relative column positions.  Which is okay unless the column order changes in any files, in which case you could be combining different types of data without knowing it.

This is why file uniformity is key to the success of this technique.

Import Excel workbooks from a folder

The method to import Excel workbooks is almost identical to CSV files.  But it introduces different challenges.  We’ve already seen in previous posts that Power Query likes to hard code values into the query.  This becomes a problem when all our Excel workbooks have worksheets with different names.  Guess what… yep, all our Example files have different worksheet names.

Copy the Example 10 – March 2019.xlsx and Example 10 – April 2019.xlsx into the folder.

All files in the folder

In Excel, follow these steps:

  • Click Data-> Get Data -> From File -> From Folder.
  • In the Folder window click Browse… navigate to the folder, then click OK.
  • When the list of files appears, click the Transform Data button.
  • The Power Query Editor window opens.
  • On the Extension column filter to only include the .xlsx files

Let’s take a look at the same two methods as before

Method 1 – Use a Custom Column

Method 1 uses a Custom Column formula as before.

Select the Content and Name columns, then Remove Other Columns.

Now let’s add the new column, click Add Column -> Custom Column.

In the Custom Column window, input the following information

New column name: ExcelFile

Custom column formula:

=Excel.Workbook([Content])

Click OK.

Excel File Custom Column

A new ExcelFile column is added.  Click on the Expand icon, then click OK to include all the columns.

Expand Excel Column Options

Every worksheet and table from the workbook is listed.  The April file contains a worksheet and a table, so filter the ExcelFile.Kind column to remove the Tables and retain only Sheets.

Import Sheet and Table

Select the ExcelFile.Data column, then Remove Other Columns.  Click the Expand icon on the ExcelFile.Data column.

Look at the Preview window – Ta-Dah!  All the worksheets are now combined.  Next Close & Load to get the data into Excel.

Method 2: Combine the files

Now time for Method 2.  This uses Power Query’s built-in combine process again.  Assuming we didn’t do Method 1, the Preview window should look like this:

XLSX Combine button

Click the combine icon as we did when using Method 2 for CSV files.

As Example 10 – April.xlsx was first in the list of files, Power Query will use this as it’s Sample File.  On the Combine Files window, select the April sheet and click OK.

Combine Excel files

If you all the worksheets have the same name, then you’re in luck.  All the data will be combined into a single table.

But for our scenario, as you might expect, only the April data will load into the Preview window.  There is only one workbook with a worksheet called April, so only one worksheet will be loaded.  We have to find the location in the M code where the sheet name has been hardcoded and adjust it.

Open the Transform Sample File query.

Transform Sample File

Click on the Navigation step in the applied steps.

Applied steps - Navigation

The M code for this step looks like this:

= Source{[Item="April",Kind="Sheet"]}[Data]

Delete the highlighted section.  The M code should now be this:

= Source{[Kind="Sheet"]}[Data]

Go back to the main query and look at the Preview window again.  All the worksheets are now combined – that’s magic.

Finish off the query

We’ll need to make some transformations to tidy things up.

  • Remove the top 3 rows
  • Promote the first row as the header by clicking Transform -> Use First Row as Headers
  • Select the last column and filter to remove “null” and “Value”.
  • Apply the correct data types to each column.

Finally, Close and Load the query into Excel.

Notes & tips

Here are a few bonus tips:

  • When looking for files, Power Query will look in sub-folders too.  This is great when files are organized within a defined structure.  But if the folders are in a bit of a mess, then a bit of housekeeping might be required to get this method to work.
  • Plan for what might go wrong, think about what might happen if somebody else were to save a different file type in the folder.  When you create the query, include extra filter steps to ensure only the files you want are imported.
  • Data which comes from an IT system normally works well with this technique.  I find that any data which requires user input tends to cause problems… why?  Because people change things, they add columns,  they rename sheets, they type over static values.  If your files come from user input, be prepared to use the combine and append method.


Want to Learn VBA & Macros?

If you want to automate Excel and save time, my recommended resouces for learning VBA Macros are:

Leave a Reply

Your email address will not be published. Required fields are marked *