Interpolation and Data Estimation

A fuller guide to linear interpolation, centred on when a straight-line estimate is reasonable, how to express the estimate honestly, and when extrapolation becomes risky.

Key formulas

Linear interpolation
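y = y1 + (x - x1)(y2 - y1) / (x2 - x1)
where (x1, y1) and (x2, y2) are the two known points, x is the target value lying between x1 and x2, and x1 ≠ x2.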

Interpolation estimates between known points

Linear interpolation assumes that the relationship between two known data points is approximately a straight line over the interval between them. It is a way of estimating within the range of measured values rather than beyond them.

That distinction matters because interpolation is usually more defensible than extrapolation. Inside the known interval you are filling a gap. Outside it, you are betting that the pattern continues.

Why the straight-line assumption may or may not be sensible

A linear estimate is reasonable when the interval is small and the underlying relationship is fairly smooth across that span. It becomes weaker when the data are curved, threshold-based, oscillatory, or sparse.

This is why plotting or at least mentally sketching the data helps. If the trend clearly bends, a linear estimate should be framed as a rough local convenience rather than a faithful model of the process.
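To illustrate how curvature and interval width affect a straight-line estimate, here is a minimal sketch. The curve y = x² and the function names `true_curve` and `linear_estimate` are assumptions chosen purely for illustration; real data will rarely follow such a tidy rule.

```python
def true_curve(x):
    # Hypothetical curved process, used only for illustration.
    return x ** 2


def linear_estimate(x1, x2, x):
    # Straight-line estimate at x from the curve's values at x1 and x2.
    y1, y2 = true_curve(x1), true_curve(x2)
    return y1 + (x - x1) * (y2 - y1) / (x2 - x1)


target = 4.0

# Wide interval: known points at x = 2 and x = 6.
wide_error = linear_estimate(2.0, 6.0, target) - true_curve(target)

# Narrow interval: known points at x = 3.5 and x = 4.5.
narrow_error = linear_estimate(3.5, 4.5, target) - true_curve(target)

print(f"wide interval error:   {wide_error}")
print(f"narrow interval error: {narrow_error}")
```

On this assumed curve the wide interval overestimates by 4 while the narrow one overestimates by 0.25, which is the sense in which a linear estimate is a local convenience rather than a model of the process.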

How the formula works conceptually

Interpolation takes a known starting value and adds a fraction of the difference between the two surrounding values. The fraction comes from how far the target x-value sits between the two known x-values.

That means the formula is really a weighted average of the two y-values, with weights determined by position in the interval.
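The weighted-average reading can be written out explicitly. The symbol t for the interval fraction is introduced here only for convenience and does not appear in the formula above.

```latex
t = \frac{x - x_1}{x_2 - x_1}, \qquad 0 \le t \le 1

y = y_1 + t\,(y_2 - y_1) = (1 - t)\,y_1 + t\,y_2
```

When t = 0 the estimate equals y1, when t = 1 it equals y2, and in between the nearer point carries the larger weight.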

  • Check that the target x-value sits between the known x-values.
  • Express the position as a fraction of the interval.
  • Apply that fraction to the y-change, then add it to the starting y-value.
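The three steps above can be sketched as a small function. This is a minimal illustration, not a definitive implementation: the name `interpolate` and the choice to raise an error for out-of-range targets are assumptions.

```python
def interpolate(x1, y1, x2, y2, x):
    """Linearly interpolate y at x between (x1, y1) and (x2, y2)."""
    if x1 == x2:
        raise ValueError("x1 and x2 must differ")

    # Step 1: check that the target x-value sits between the known x-values.
    low, high = min(x1, x2), max(x1, x2)
    if not low <= x <= high:
        raise ValueError("x is outside the known range; that would be extrapolation")

    # Step 2: express the position as a fraction of the interval.
    fraction = (x - x1) / (x2 - x1)

    # Step 3: apply that fraction to the y-change and add it to the starting y-value.
    return y1 + fraction * (y2 - y1)


print(interpolate(4, 12, 8, 20, 5))
```

Refusing out-of-range targets is one possible design choice; it keeps an extrapolated value from being returned silently under the name of interpolation.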

Worked example

Suppose a value is 12 at x = 4 and 20 at x = 8. At x = 5, the target sits one quarter of the way across the interval, so add one quarter of the y-change of 8, which is 2, to the starting value of 12. The interpolated value is therefore 14.

This example shows why interpolation is intuitive: it is just proportional movement between two neighbouring points.
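Written out step by step, the arithmetic of the example is as follows. The symbol t for the interval fraction is used here only as shorthand.

```latex
t = \frac{5 - 4}{8 - 4} = \frac{1}{4}

y = 12 + \frac{1}{4}\,(20 - 12) = 12 + 2 = 14

\text{Check as a weighted average: } \frac{3}{4}\cdot 12 + \frac{1}{4}\cdot 20 = 9 + 5 = 14
```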

Common mistakes and communication

Linear interpolation is strongest when the gap is small and the surrounding data look smooth. State those conditions explicitly when presenting the result. Common mistakes include:

  • Estimating outside the known range and calling it interpolation rather than extrapolation.
  • Ignoring obvious curvature in the data and presenting the estimate as exact.
  • Using known points that are not sorted by x-value, so the pair chosen may not actually bracket the target and the interval fraction comes out wrong.
  • Failing to state that the result is an estimate derived from local linear behaviour.
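Several of these mistakes can be guarded against in code. The sketch below orders the two points before computing anything and labels the result as interpolation or extrapolation. The names `Estimate`, `labelled_estimate`, and `describe` are hypothetical, and the wording of the description is only one way to phrase the caveat.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    value: float
    kind: str  # "interpolation" or "extrapolation"


def labelled_estimate(point_a, point_b, x):
    """Straight-line estimate at x, labelled by whether x is in range."""
    # Sort the (x, y) pairs by x so the interval is well defined.
    (x1, y1), (x2, y2) = sorted([point_a, point_b])
    if x1 == x2:
        raise ValueError("the two known points must have different x-values")

    value = y1 + (x - x1) * (y2 - y1) / (x2 - x1)
    kind = "interpolation" if x1 <= x <= x2 else "extrapolation"
    return Estimate(value, kind)


def describe(estimate):
    # State plainly that the number is an estimate, not a measurement.
    return (
        f"approximately {estimate.value:g} "
        f"(linear {estimate.kind}, an estimate rather than a measurement)"
    )


print(describe(labelled_estimate((8, 20), (4, 12), 5)))
print(describe(labelled_estimate((4, 12), (8, 20), 10)))
```

Carrying the label alongside the value makes it harder to present an out-of-range figure as if it were an in-range one.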

Where this fits in the wider library

Interpolation sits naturally beside descriptive statistics and probability because it is part of responsible data reasoning: summarise what you know, estimate cautiously where justified, and describe the uncertainty honestly. Use it when you need a missing in-range estimate, not as a substitute for collecting better data.
