Recently, I received the following question from a reader:
“I have an Excel question – Is there a way to interpolate a value from a table? I have an X and Y that are not on the table, but have correlated data so want to calculate the interpolated value”.
At first, I thought I could just use simple mathematics, which would definitely work if the results were perfectly linear. But what if they are not? What if they are not perfectly correlated?
I then thought about Excel’s FORECAST function. Based on its name, the FORECAST function seems like an odd choice, but it’s actually one of the best options for linear interpolation in Excel.
But what if the X and Y relationship is not linear at all? Tere would need to be a different approach.
In this post we’ll look at these three options:
- Interpolation using simple mathematics
- Interpolation using the FORECAST function
- Interpolation when the data is not linear.
Interpolation using simple mathematics
Using simple mathematics works well when there are just two pairs of numbers or where the relationship between X & Y is perfectly linear. Here is a basic example:
The formula in Cell E3 is:
That might look a bit complicated to some, so I’ll just give a quick overview of this formula.
The last section (highlighted in red above) calculates how much the Y value moves whenever the X value moves by 1. In our example, Y moves by 1.67 for every 1 of X.
The second section (in red above) calculates how far our interpolated X is away from the first X, then multiplies it by the value calculated above. Based on our example, it is 17.5 (Cell E2) minus 10 (Cell A2), the result of which is then multiplied by 1.67 (which is 12.5).
Finally, in the first section of the formula (in red above), we add the first Y value. In our example, this provides a final result of 77.5 (65 + 12.5). For anybody who remembers high school mathematics the formula is as follows:
Even if you don’t remember that from school, the good news is that Excel has given us a simpler option, the FORECAST function.
Interpolation using the FORECAST function
The FORECAST function is great when we want to extrapolate values, for example when trying to forecast the value of a future point. But, FORECAST can also be used to estimate an interpolated value.
Using the same numbers as above, the function in Cell E3 could have been:
The FORECAST function has the following Syntax:
The three arguments in the function are
- x – the data point for which we want to predict a value
- known_y’s – the range of cells or array of values containing the known Y values
- known_x’s – the range of cells or array of values containing the known X values
When using the FORECAST function the result of Cell E3 is also 77.5 (just as in the first example above).
But . . . what if our data isn’t perfectly linear? Look at the chart below, the data has a linear relationship, but it’s not perfect.
In these circumstances the FORECAST function is even more useful, as it does not just interpolate between the first and last values.
The function in Cell E3 is:
In these circumstances, the FORECAST estimates a value based on all the available data, rather than just the start and end points. The result of the FORECAST function in cell E3 is an estimated a value of 77.3 (rounded to 1 decimal place), which is more accurate than if we had applied simple linear interpolation.
Interpolation when the data is not linear
But, what if the data is not linear at all? Then what? Here is our new data:
Look at the chart below. If we took a simple linear approach, that would give us a value of 77.5, quite a long way from the curve. Using the FORECAST function as above would provide a result of 70.8, also a long way from the curve.
In this circumstance, our approach will need to change. We need to find a way of obtaining the result above and the result below the value we wish to interpolate, then using linear interpolation between those two points. Using our example of 17.5, we would need to retrieve the results of 16 and 18, then calculate the linear interpolation. To do this we will use the INDEX, MATCH and FORECAST functions combined.
Firstly, we need to use the MATCH function to retrieve the position of the value lower than 17.5.
The last argument of the MATCH function indicates that we wish to use an approximate match (i.e., the closest value below the lookup value). 16 is the closest value below 17.5. 16 is the 5th item in the list of X values so that it will return a the value of 5.
Now things starts to get a bit tricky. We can combine the INDEX function with the MATCH function to create a range for the two Y values (note this method does require data sorted in ascending order).
The first INDEX function will return the cell reference of B6 (the result of the red highlighted section). The second INDEX function will return the cell reference of B7 (the result of the blue highlighted section). These are separated by a colon ( : ), to create a range – B6:B7
We can do the same to create a range for the two X values.
Note: It is also possible to use the OFFSET function if you prefer (but it is a volatile function, so may cause slower calculations).
Now, let’s combine the range we created for the Y values and the range we created for the X values within the FORECAST function.
=FORECAST(E2, INDEX(B2:B11,MATCH(E2,A2:A11,1)):INDEX(B2:B11,MATCH(E2,A2:A11,1)+1), INDEX(A2:A11,MATCH(E2,A2:A11,1)):INDEX(A2:A11,MATCH(E2,A2:A11,1)+1))
That’s a pretty big formula, right. Hopefully I’ve managed to explain it so that it’s not too scary.
The result of the Non-Linear FORECAST function is 67.6 (to 1 decimal place, shown in Cell F5 below). Have a look at the chart again you will see it is a much more accurate estimation based on the data we have.
One word of warning – ultimately this is still a linear interpolation but based on the two values either side. Therefore, the distance between the values above and below will have a direct impact on how accurate the interpolation is.
Other uses for the FORECAST function
The FORECAST function can also be used to extrapolate values to estimate a future value. When the data follows a season trend, we must use some Excel trickery – find out how here.