Copy a data table from PDF into Excel

PDF files are pretty much the norm for distributing reports these days. They provide a nice, easy way to collate documents from different sources for distribution. However, once a document is in PDF format, getting that information back into a usable form is a nightmare. If we try to copy and paste a data table from a PDF into Excel, it just doesn’t format as expected.

 

PDFs are not born equal

The pasted information will be displayed in Excel differently based on how the PDF was created. In my experience the pasted data will show as one of the following:

  • A list of values
  • A continuous text string for each line
  • A picture

I would love to offer you the perfect solution to get the PDF data table into Excel; however, I don’t believe there is one. If the paste displays as a picture then, as far as I know, you will need to resort to third-party software which includes OCR (optical character recognition). If the paste is either a list of values or a continuous text string, then there are some possible workarounds. One of these should work, providing reasonable results and saving you time.

Here is our example PDF file:

 

Copy the PDF table into Word first

Excel is a software program which is designed to work with tables, whilst Word is designed to work with text. However, Word is actually better at dealing with PDF tables.

  • Copy the table from the PDF document
  • Paste the table into Word
  • Copy the table from Word
  • Paste the table into Excel

You may now have a perfect data table in Excel. Or maybe, which is more likely, you have a table which requires a bit of tweaking.  It may not be perfect, but it’s still closer than what you had before.


 

Pasted list: Use VBA to format the list

If the paste into Excel is just a list of values in one column we can turn to VBA for a bit of help.

 

The VBA code below will cycle through the selected data and recreate the table layout in the columns to the right of the list.

  • Copy the table from the PDF document
  • Paste the table into Excel
  • Select all the pasted cells
  • Run the Macro below

 

Sub ConvertListToTable()

'Define the variables (Long avoids an overflow error on long lists)
Dim NoOfColumns As Long
Dim TargetRow As Long
Dim TargetCol As Long
Dim i As Long

'Set the initial values for the variables of where to place the table
TargetRow = Selection.Row
TargetCol = Selection.Column

'Set the variable of the number of columns in the table
NoOfColumns = 13

'Loop through every cell in the selected range
For i = 0 To Selection.Rows.Count - 1
    'Move the Target Column one to the right (the table starts in the column next to the list)
    TargetCol = TargetCol + 1

    'Set the value of the Target Cell based on the Source Cell
    Cells(TargetRow, TargetCol).Value = Cells(Selection.Row + i, Selection.Column).Value

    'Once the row is full, reset the Target Column and move to the next Target Row
    If TargetCol = Selection.Column + NoOfColumns Then
        TargetRow = TargetRow + 1
        TargetCol = Selection.Column
    End If
Next i
End Sub

 

We will need to change the following line so that it equals the number of columns in the source table; otherwise the data will end up in the wrong columns.

NoOfColumns = 13
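
To see why the column count matters, it can help to look at the arithmetic on its own. The sketch below is not part of the macro above; the two function names are hypothetical helpers I have made up for illustration. They take the zero-based position of an item in the pasted list and return how many rows down and how many columns across it belongs in the rebuilt table.

```vba
Option Explicit

' Hypothetical helper: zero-based row offset of a list item
' when the list is laid out in rows of NoOfColumns cells.
Public Function TableRowOffset(ByVal ItemIndex As Long, ByVal NoOfColumns As Long) As Long
    ' Integer division: every NoOfColumns items start a new row
    TableRowOffset = ItemIndex \ NoOfColumns
End Function

' Hypothetical helper: zero-based column offset of the same list item.
Public Function TableColOffset(ByVal ItemIndex As Long, ByVal NoOfColumns As Long) As Long
    ' The remainder gives the position within the row
    TableColOffset = ItemIndex Mod NoOfColumns
End Function
```

With 13 columns, item 12 is the last cell of the first row (row offset 0, column offset 12) and item 13 is the first cell of the second row (row offset 1, column offset 0). If the column count is wrong, every item after the first row shifts sideways, which is why the table looks scrambled.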

 

Pasted strings: Use Excel functionality to split the string


If the paste into Excel is a continuous text string for each line then you can use Excel’s built-in functionality to separate the string into columns.

Data -> Text to columns

Select “Delimited” from the Convert Text to Columns Wizard Step 1 window

Click Next

Select “Space” from the Convert Text to Columns Wizard Step 2 window

Click Next


Click Finish

If there are spaces between words in the data, this will unfortunately separate each of those words into a different cell. We now have a table in Excel which probably just requires a bit of tweaking.  It may not be perfect, but it’s still closer than what we had before.
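
If we need to repeat the wizard steps above regularly, they can be run from a macro instead. This is only a sketch: it uses Excel's Range.TextToColumns method, the macro name is hypothetical, and ConsecutiveDelimiter:=True is my own assumption (it matches ticking "Treat consecutive delimiters as one" in Step 2, which the steps above do not mention). It expects the pasted strings to be selected in a single column.

```vba
Sub SplitSelectionOnSpaces()

    'Assumes the pasted strings are selected in a single column.
    'Like the wizard, the results overwrite the selected cells and
    'spill into the columns to the right.
    Selection.TextToColumns _
        Destination:=Selection.Cells(1, 1), _
        DataType:=xlDelimited, _
        ConsecutiveDelimiter:=True, _
        Space:=True

End Sub
```

As with the wizard, any description containing spaces will still be broken across several cells.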

 

Pasted strings: Use VBA to split the string

Rather than using the Excel functionality we could use VBA code to achieve the same effect.

  • Copy the table from the PDF document
  • Paste the table into Excel
  • Select all the pasted cells
  • Run the Macro below

 

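Sub PDFStringtoTable()

Dim SplitString As Variant
Dim Rng As Range
Dim Cell As Range
Dim i As Long
Set Rng = Selection

'Loop through each selected cell
For Each Cell In Rng

    'Split the string into an array, using a space as the delimiter
    SplitString = Split(Cell.Value, " ")

    'Write every element to the right of the source cell (UBound is the last index)
    For i = 0 To UBound(SplitString)

        Cell.Offset(0, i + 1).Value = SplitString(i)

    Next i

Next Cell

End Sub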
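
Splitting on every space has the same weakness as Text to Columns: a description made of several words ends up spread over several cells. One possible workaround, assuming the description is always the first column and every other column is a single word or number, is to decide the number of columns up front and merge any surplus words back into the first field. The function below is a hypothetical helper sketched on that assumption, not a general solution; it also assumes the words are separated by single spaces.

```vba
Option Explicit

' Hypothetical helper: splits Text on spaces into NoOfColumns fields,
' keeping any surplus leading words together in the first field.
Public Function SplitKeepFirstColumn(ByVal Text As String, ByVal NoOfColumns As Long) As Variant
    Dim Words As Variant
    Dim Fields() As String
    Dim Extra As Long
    Dim i As Long

    Words = Split(Trim$(Text), " ")

    'Number of words beyond one per column
    Extra = UBound(Words) + 1 - NoOfColumns

    'Nothing to merge (or an invalid column count): return the plain split
    If Extra <= 0 Or NoOfColumns < 1 Then
        SplitKeepFirstColumn = Words
        Exit Function
    End If

    ReDim Fields(0 To NoOfColumns - 1)

    'Join the first Extra + 1 words back into the description field
    Fields(0) = Words(0)
    For i = 1 To Extra
        Fields(0) = Fields(0) & " " & Words(i)
    Next i

    'Copy the remaining words, one per field
    For i = 1 To NoOfColumns - 1
        Fields(i) = Words(Extra + i)
    Next i

    SplitKeepFirstColumn = Fields
End Function
```

For a made-up line such as "Office supplies 120 45 300" and a column count of 4, the first field is "Office supplies" and the remaining fields are "120", "45" and "300". The returned array can be written to the sheet with the same Cell.Offset loop used in the macro above.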

 

Selective copying from the PDF

We can increase the chance of our data formatting correctly by being selective about which parts of the PDF to copy. We don’t have to select the whole table in the PDF file.  For example, if the first column in the table is a description column (maybe with spaces between the words), then we will get better results by selecting the first column and converting it to Excel on its own, then selecting and converting the other columns.

To achieve this selective copying press Ctrl+Alt whilst selecting the data in the PDF table.

 

A perfect solution?

I have even used Adobe’s own PDF to Excel converter, and even that requires a lot of manual adjustments.  Therefore, I don’t believe that a perfect solution exists for this problem.  But, hopefully one of these workarounds has provided a reasonable solution and has saved you a lot of re-keying time.