PDF files are pretty much the norm for distributing reports these days. They provide an easy way to collate documents from different sources for distribution. However, once a document is in PDF format, getting that information back into a usable form is a nightmare. If we try to copy and paste a data table from a PDF into Excel, it just doesn’t format as expected. So, in this post, we look at how to copy a table from PDF to Excel.
PDFs are not born equal
The pasted information will be displayed in Excel differently based on how the PDF was created. In my experience the pasted data will show as one of the following:
- A list of values
- A continuous text string for each line
- A picture
I would love to offer you the perfect solution to get the PDF data table into Excel; however, I don’t believe there is one. If the paste displays as a picture then, as far as I know, you will need to resort to third-party software which includes OCR (optical character recognition). If the paste is either a list of values or a continuous text string, then there are some possible workarounds. One of these should work, providing reasonable results and saving you time.
Here is our example PDF file:
Copy table from PDF to Excel via Word
Excel is a software program which is designed to work with tables, whilst Word is designed to work with text. However, Word is actually better at dealing with PDF tables.
- Copy the table from the PDF document
- Paste the table into Word
- Copy the table from Word
- Paste the table into Excel
You may now have a perfect data table in Excel. Or maybe, which is more likely, you have a table which requires a bit of tweaking. It may not be perfect, but it’s still closer than what you had before.
Pasted list: Use VBA to format the list
If the paste into Excel is just a list of values in one column we can turn to VBA for a bit of help.
The VBA code below cycles through the selected data and recreates the table layout in the columns to the right of the pasted list.
- Copy the table from the PDF document
- Paste the table into Excel
- Select all the pasted cells
- Run the Macro below
Sub ConvertListToTable()

    'Define the variables
    Dim NoOfColumns As Long
    Dim TargetRow As Long
    Dim TargetCol As Long
    Dim i As Long

    'Set the initial values for where to place the table
    TargetRow = Selection.Row
    TargetCol = Selection.Column

    'Set the number of columns in the source table
    NoOfColumns = 13

    'Loop through every cell in the selected range
    For i = 0 To Selection.Rows.Count - 1

        'Move the target one column to the right
        TargetCol = TargetCol + 1

        'Set the value of the target cell based on the source cell
        Cells(TargetRow, TargetCol).Value = Cells(Selection.Row + i, Selection.Column).Value

        'After the last column, reset the target column and move down one row
        If TargetCol = Selection.Column + NoOfColumns Then
            TargetRow = TargetRow + 1
            TargetCol = Selection.Column
        End If

    Next i

End Sub
We will need to change the following line so that it is equal to the number of columns in the source table, else the data will be in the wrong columns.
NoOfColumns = 13
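The row and column arithmetic behind the macro can also be sketched as a standalone function. This is only an illustration, not part of the original macro: the name ReshapeListToTable is hypothetical, and it assumes the list is passed in as a one-dimensional array with at least one item. It returns a zero-based, two-dimensional array rather than writing to the worksheet.

```vba
'Hypothetical helper illustrating the row/column arithmetic only.
'Items is assumed to be a one-dimensional array with at least one item.
Function ReshapeListToTable(ByVal Items As Variant, _
                            ByVal NoOfColumns As Long) As Variant

    Dim Table() As Variant
    Dim ItemCount As Long
    Dim RowCount As Long
    Dim i As Long

    ItemCount = UBound(Items) - LBound(Items) + 1

    'Round up so that a short final row still gets a row of its own
    RowCount = (ItemCount + NoOfColumns - 1) \ NoOfColumns

    ReDim Table(0 To RowCount - 1, 0 To NoOfColumns - 1)

    For i = 0 To ItemCount - 1
        'Integer division gives the row, the remainder gives the column
        Table(i \ NoOfColumns, i Mod NoOfColumns) = Items(LBound(Items) + i)
    Next i

    ReshapeListToTable = Table

End Function
```

With a made-up list such as Array("A", "B", "C", "D", "E", "F") and NoOfColumns set to 3, item 0 lands in row 0, column 0 and item 5 lands in row 1, column 2. If NoOfColumns does not match the source table, every row after the first drifts out of alignment, which is why that line in the macro matters.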
Pasted strings: Use Excel functionality to split the string
If the paste into Excel is a continuous text string for each line then you can use Excel’s built-in functionality to separate the string into columns.
- With the pasted cells selected, click Data -> Text to Columns
- Select “Delimited” in Step 1 of the Convert Text to Columns Wizard
- Click Next
- Select “Space” in Step 2 of the Convert Text to Columns Wizard
- Click Next
- Click Finish
If there are spaces between words in the data, this will unfortunately separate each of those words into a different cell. We now have a table in Excel which probably just requires a bit of tweaking. It may not be perfect, but it’s still closer than what we had before.
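The wizard steps above can also be run from a macro using the Range.TextToColumns method. The sketch below is an assumption about how you might want to set it up rather than a definitive version: the macro name is made up, and it switches on ConsecutiveDelimiter (the wizard's "Treat consecutive delimiters as one" option), so that runs of several spaces do not create empty columns. It does not solve the problem of spaces between words.

```vba
Sub SplitSelectionOnSpaces()

    'Assumes the selection is a single column of pasted text strings.
    'The results overwrite the selection and fill the columns to its right.
    'The other delimiters are switched off explicitly, because Excel
    'remembers the settings from the last time Text to Columns was used.
    Selection.TextToColumns _
        Destination:=Selection.Cells(1, 1), _
        DataType:=xlDelimited, _
        ConsecutiveDelimiter:=True, _
        Tab:=False, _
        Semicolon:=False, _
        Comma:=False, _
        Space:=True, _
        Other:=False

End Sub
```

As with the wizard, the original strings are replaced, so it is worth running this on a copy of the pasted data.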
Pasted Strings: Use VBA to split the string
Rather than using the Excel functionality we could use VBA code to achieve the same effect.
- Copy the table from the PDF document
- Paste the table into Excel
- Select all the pasted cells
- Run the Macro below
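Sub PDFStringtoTable()

    Dim SplitString As Variant
    Dim Rng As Range
    Dim Cell As Range
    Dim i As Long

    Set Rng = Selection

    'Loop through every cell in the selected range
    For Each Cell In Rng

        'Split the cell's text into an array, using a space as the separator
        SplitString = Split(Cell.Value, " ")

        'Write each item into the cells to the right of the source cell
        For i = 0 To UBound(SplitString)
            Cell.Offset(0, i + 1).Value = SplitString(i)
        Next i

    Next Cell

End Sub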
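The macro above splits on every space, so a text column containing spaces ends up spread over several cells. One possible workaround, sketched below, is to count fields from the right-hand end of the string and treat everything before them as a single description. The function name and the TrailingFields parameter are made up for this example, and it assumes the description is the first column and the remaining columns never contain spaces.

```vba
'Hypothetical helper, not part of the original macro.
'Returns a zero-based array: item 0 is the leading text, followed by
'the last TrailingFields space-separated items of LineText.
Function SplitKeepingDescription(ByVal LineText As String, _
                                 ByVal TrailingFields As Long) As Variant

    Dim Parts() As String
    Dim Result() As String
    Dim FirstTrailing As Long
    Dim i As Long

    'Collapse repeated spaces and trim the ends before splitting
    Parts = Split(Application.WorksheetFunction.Trim(LineText), " ")

    'Too few items for a description plus trailing fields: return the plain split
    If UBound(Parts) < TrailingFields Then
        SplitKeepingDescription = Parts
        Exit Function
    End If

    ReDim Result(0 To TrailingFields)
    FirstTrailing = UBound(Parts) - TrailingFields + 1

    'Everything before the trailing fields is joined back into the description
    For i = 0 To FirstTrailing - 1
        Result(0) = Result(0) & IIf(i > 0, " ", "") & Parts(i)
    Next i

    'Copy the trailing fields into the remaining slots
    For i = 1 To TrailingFields
        Result(i) = Parts(FirstTrailing + i - 1)
    Next i

    SplitKeepingDescription = Result

End Function
```

For a made-up line such as "Office supplies 120 45" with TrailingFields set to 2, the function returns "Office supplies", "120" and "45". To use it in the macro, replace the Split line with SplitString = SplitKeepingDescription(CStr(Cell.Value), 2), changing 2 to the number of columns that follow the description.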
Selective copying from the PDF
We can increase the chance of our data formatting correctly by being selective about which parts of the PDF to copy. We don’t have to select the whole table in the PDF file. For example, if the first column in the table is a description column (maybe with spaces between the words), then we will get better results by selecting just the first column and converting it to Excel, then doing the same for the other columns.
To achieve this selective copying, press Ctrl+Alt whilst selecting the data in the PDF table.
A perfect solution?
I have even used Adobe’s own PDF to Excel converter, and even that requires a lot of manual adjustments. Therefore, I don’t believe that a perfect solution exists to copy a table from PDF to Excel. But, hopefully one of these workarounds has provided a reasonable solution and has saved you a lot of re-keying time.
I have a PDF address file in the following format. How can I convert this to a tabular form in Excel?
Name1 Name 2 Name 3
Address1 Address2 Address 3
Address1A Address 2A Address 3A
City1 City 2 City 3
phone number
In the above details, note that the address in the first column has 5 lines, whereas the addresses in the 2nd and 3rd columns have 4 lines.
How can we convert this format to Excel?
Hi Sandip, that’s a good question, but unfortunately I don’t have a good answer. Getting data back out of a PDF is difficult, and I don’t believe there is a perfect solution.
A blank in a PDF is nothing; Excel doesn’t even know to leave a blank cell for it. As a result, the records are not of a consistent length, and it’s hard to apply any rules in these circumstances.
Did you try copying it into Word first? Did that give you reasonable results?
Are there any parts of the text that you can use as a separator? For example, does each name happen to have “Name:” at the start? It’s unlikely, but you might be lucky.
THANK YOU VERY MUCH!!!!!!!
No problem – glad I could help 🙂
Thank you very much! It’s really helpful
Good news, I’m glad it was useful.