Copy a data table from PDF into Excel

PDF files are pretty much the norm for distributing reports these days. They provide a nice, easy way to collate documents from different sources for distribution.  However, once a document is in PDF format, getting that information back into a usable form is a nightmare.  If we try to copy and paste a data table from a PDF into Excel, it just doesn’t format as expected.

 

PDFs are not born equal

The pasted information will be displayed in Excel differently based on how the PDF was created. In my experience the pasted data will show as one of the following:

  • A list of values
  • A continuous text string for each line
  • A picture

I would love to offer you the perfect solution to get the PDF data table into Excel; however, I don’t believe there is one.  If the paste displays as a picture then, as far as I know, you will need to resort to third-party software which includes OCR (optical character recognition).  If the paste is either a list of values or a continuous text string, then there are some possible workarounds. One of these should give reasonable results and save you time.

Here is our example PDF file:

 

Copy the PDF table into Word first

Excel is designed to work with tables, whilst Word is designed to work with text. However, Word is actually better at dealing with PDF tables.

  • Copy the table from the PDF document
  • Paste the table into Word
  • Copy the table from Word
  • Paste the table into Excel

You may now have a perfect data table in Excel. Or maybe, which is more likely, you have a table which requires a bit of tweaking.  It may not be perfect, but it’s still closer than what you had before.

 

Pasted list: Use VBA to format the list

If the paste into Excel is just a list of values in one column we can turn to VBA for a bit of help.

 

The VBA code below will cycle through the selected data and recreate a table layout.

  • Copy the table from the PDF document
  • Paste the table into Excel
  • Select all the pasted cells
  • Run the Macro below

 

Sub ConvertListToTable()

'Define the variables (Long avoids an overflow on lists longer than 32,767 rows)
Dim NoOfColumns As Long
Dim TargetRow As Long
Dim TargetCol As Long
Dim i As Long

'Set the initial values for the variables of where to place the table
TargetRow = Selection.Row
TargetCol = Selection.Column 

'Set the variable of the number of columns in the table
NoOfColumns = 13

'loop through every cell in the selected range
For i = 0 To Selection.Rows.Count - 1
    'Change the value for the Target Column
    TargetCol = TargetCol + 1

    'Set the value of the Target Cell based on the Source Cell
    Cells(TargetRow, TargetCol).Value = Cells(Selection.Row + i, Selection.Column).Value

    'Reset the Target Column and change the value for the Target Row
    If TargetCol = Selection.Column + NoOfColumns Then
        TargetRow = TargetRow + 1
        TargetCol = Selection.Column
    End If
Next i
End Sub

 

We will need to change the following line so that it is equal to the number of columns in the source table, otherwise the data will end up in the wrong columns.

NoOfColumns = 13
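
As a rough sketch of the same idea, the position of each value in the table can also be calculated directly from its position in the list, instead of tracking the target row and column as we go. The macro name ConvertListToTableByIndex and the InputBox prompt for the column count are my own assumptions for illustration; they are not part of the macro above. It also assumes the selection is a single column and the columns to its right are empty.

```vba
Sub ConvertListToTableByIndex()

'Hypothetical variation: calculate each target cell from the list position
Dim NoOfColumns As Long
Dim Source As Range
Dim i As Long

'Only use the first column of the selection as the source list
Set Source = Selection.Columns(1)

'Ask for the number of columns in the source table (assumed input method)
NoOfColumns = CLng(Val(InputBox("Number of columns in the table?")))
If NoOfColumns < 1 Then Exit Sub

'Value i belongs in table row (i \ NoOfColumns) and table column (i Mod NoOfColumns)
'The table starts one column to the right of the pasted list
For i = 0 To Source.Cells.Count - 1
    Source.Cells(1, 1).Offset(i \ NoOfColumns, 1 + (i Mod NoOfColumns)).Value = _
        Source.Cells(i + 1, 1).Value
Next i

End Sub
```

With 13 columns, the first 13 values fill the first table row, the 14th value starts the second row, and so on.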

 

Pasted strings: Use Excel functionality to split the string

If the paste into Excel is a continuous text string for each line then you can use Excel’s built-in functionality to separate the string into columns.

Data -> Text to columns

Select “Delimited” from the Convert Text to Columns Wizard Step 1 window

Click Next

Select “Space” from the Convert Text to Columns Wizard Step 2 window

Click Next

Click Finish

If there are spaces between words in the data, this will unfortunately separate each of those words into a different cell. We now have a table in Excel which probably just requires a bit of tweaking.  It may not be perfect, but it’s still closer than what we had before.
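
If we need to repeat these wizard steps often, they can be run from a macro using the Range.TextToColumns method. The sketch below is only an illustration: the macro name PDFStringTextToColumns is hypothetical, and setting ConsecutiveDelimiter to True (the wizard's "Treat consecutive delimiters as one" option) is my own assumption about what is usually wanted for PDF text. It assumes a single column of pasted strings is selected and the columns to its right are empty.

```vba
Sub PDFStringTextToColumns()

'Hypothetical macro name; runs Text to Columns on the selected cells
'Assumes one column is selected and the columns to its right are empty
Selection.TextToColumns _
    Destination:=Selection.Cells(1, 1), _
    DataType:=xlDelimited, _
    ConsecutiveDelimiter:=True, _
    Tab:=False, _
    Semicolon:=False, _
    Comma:=False, _
    Space:=True, _
    Other:=False

End Sub
```

Like the wizard, this overwrites the original strings, starting from the first selected cell.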

 

Pasted Strings: Use VBA to split the string

Rather than using the Excel functionality we could use VBA code to achieve the same effect.

  • Copy the table from the PDF document
  • Paste the table into Excel
  • Select all the pasted cells
  • Run the Macro below

 

Sub PDFStringtoTable()

Dim SplitString As Variant
Dim Rng As Range
Dim Cell As Range
Dim i As Integer
Set Rng = Selection

For Each Cell In Rng

    SplitString = Split(Cell.Value, " ")

    'Loop through every element of the array, including the last one
    For i = 0 To UBound(SplitString)

        Cell.Offset(0, i + 1).Value = SplitString(i)

    Next i

Next Cell

End Sub
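
Text pasted from a PDF often contains leading, trailing or repeated spaces, and splitting on a single space then produces empty cells. A possible variation, sketched below, is to tidy each string with Excel's TRIM function before splitting it. The macro name PDFStringtoTableTrimmed is hypothetical, and the sketch assumes the selected cells contain text or numbers and not error values.

```vba
Sub PDFStringtoTableTrimmed()

Dim SplitString As Variant
Dim Cell As Range
Dim i As Long

For Each Cell In Selection.Cells

    'Excel's TRIM removes leading/trailing spaces and collapses repeated spaces
    SplitString = Split(Application.WorksheetFunction.Trim(CStr(Cell.Value)), " ")

    'Write each part into the cells to the right of the original string
    For i = LBound(SplitString) To UBound(SplitString)
        Cell.Offset(0, i + 1).Value = SplitString(i)
    Next i

Next Cell

End Sub
```

Note that this uses WorksheetFunction.Trim and not VBA's own Trim function, because the VBA version only removes spaces from the start and end of the string.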

 

Selective copying from the PDF

We can increase the chance of our data formatting correctly by being selective about which parts of the PDF to copy. We don’t have to select the whole table in the PDF file.  For example, if the first column in the table is a description column (maybe with spaces between the words), then we will get better results by selecting and converting the first column on its own, then converting the other columns separately.

To achieve this selective copying, hold Ctrl+Alt whilst selecting the data in the PDF table.

 

A perfect solution?

I have even used Adobe’s own PDF to Excel converter, and even that requires a lot of manual adjustments.  Therefore, I don’t believe that a perfect solution exists for this problem.  But, hopefully one of these workarounds has provided a reasonable solution and has saved you a lot of re-keying time.

2 thoughts on “Copy a data table from PDF into Excel”

  1. Sandip says:

    I have a pdf address file that is of the following format. how can i convert this to an excel tabular form.

    Name1 Name 2 Name 3
    Address1 Address2 Address 3
    Address1A Address 2A Address 3A
    City1 City 2 City 3
    phone number

    In the above details, note that in first col address is of 5 lines wherein in 2nd and 3rd is of 4 line.
    How can we have this format converted to excel.

  2. Excel Off The Grid says:

    Hi Sandip – That’s a good question, but unfortunately I don’t have a good answer. Getting data back out of a PDF is difficult, and I don’t believe there is a perfect solution.

    A blank in a PDF is nothing; Excel doesn’t even know to leave a blank cell for it. As a result, each record is not of a consistent length. It’s hard to apply any rules in these circumstances.

    Did you try copying it into Word first? Did that give you reasonable results?

    Are there any parts of the text which you can use as a separator (i.e. does Name happen to have “Name:” at the start – it’s unlikely, but you might be lucky)?
