Imagine you have a handful of data points. They might be measurements from a lab, outputs from a costly computer simulation, or simply values of a complicated function. Your task is to “connect the dots.” But you want more than just a jagged line; you want a single, smooth, elegant curve that passes exactly through every single point. This is the problem of interpolation.

The most natural tool for this job is the polynomial. Why? Because polynomials are the workhorses of mathematics. They are simple to evaluate, a joy to differentiate, and a breeze to integrate. More profoundly, the Weierstrass Approximation Theorem guarantees that any continuous function can be approximated, to any degree of accuracy we desire, by a polynomial.

So, our goal is clear: given a set of data points $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$, we seek the unique polynomial $p_n$ of degree at most $n$ that satisfies the interpolation condition:

$$p_n(x_i) = y_i, \qquad i = 0, 1, \dots, n.$$

The points $x_i$ are called the nodes or knots. The question is, how do we find this polynomial? This chapter is a story about three ways to build it, the hidden dangers we’ll encounter, and the beautiful solution that saves the day.

The Monomial Basis: The Obvious, Wrong First Step

The first idea that comes to mind is to write the polynomial in its most familiar form, the monomial basis $\{1, x, x^2, \dots, x^n\}$. We assume our polynomial has the form:

$$p_n(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n.$$

To find the $n+1$ unknown coefficients $a_0, a_1, \dots, a_n$, we simply enforce the interpolation conditions $p_n(x_i) = y_i$. This creates a system of $n+1$ linear equations for the coefficients:

$$\begin{pmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{pmatrix}.$$

The matrix on the left is the famous Vandermonde matrix. In pure mathematics, this matrix is invertible as long as the nodes are distinct, which guarantees a unique solution exists. It seems we’re done.

But in the finite world of computer arithmetic, this approach is a trap. For even a moderate number of points (say, around 15), the columns of the Vandermonde matrix begin to look very similar. For example, on the interval $[0, 1]$, the columns corresponding to neighboring high powers of $x$ are nearly indistinguishable. The matrix becomes nearly singular, or ill-conditioned. When a computer tries to solve this system, tiny round-off errors from Chapter 1 are amplified into enormous, meaningless errors in the coefficients.

import numpy as np
 
# Let's try to fit 15 points on [0, 1]
x = np.linspace(0, 1, 15)
y = np.random.rand(15)
 
# NumPy's polyfit attempts to solve the Vandermonde system.
# It's smart enough to warn us when things go wrong.
try:
    coeffs = np.polyfit(x, y, 14)
except np.linalg.LinAlgError as e:
    print(f"Error: {e}")
 
# More likely, you will see a warning like this:
# RankWarning: Polyfit may be poorly conditioned

The Perils of the Monomial Basis

The monomial basis is a terrible choice for practical high-degree polynomial interpolation. The resulting Vandermonde system is severely ill-conditioned, rendering the computed coefficients useless. The most “obvious” mathematical representation is often a poor numerical one.

Interestingly, even if the coefficients are computed with large errors, there’s a wonderfully stable way to evaluate the polynomial: Horner’s Method. It’s a simple, recursive rearrangement:

$$p_n(x) = a_0 + x\bigl(a_1 + x\bigl(a_2 + \dots + x\,(a_{n-1} + x\,a_n)\dots\bigr)\bigr).$$

This nested form avoids computing large, potentially inaccurate powers of $x$ and is both fast ($\mathcal{O}(n)$ operations) and numerically robust. But a stable evaluation method can’t save us if we can’t find the coefficients in the first place.
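
As a minimal sketch (the helper below is illustrative, not part of the chapter’s library code), Horner’s method looks like this in Python:

def horner(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x**n using Horner's nested form.

    coeffs[k] is the coefficient of x**k.
    """
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

# Example: p(x) = 1 + 2x + 3x^2 evaluated at x = 2 gives 17
print(horner([1.0, 2.0, 3.0], 2.0))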

The Newton Basis: A Smarter Construction

The failure of the monomial basis teaches us a crucial lesson: the choice of basis matters. We need a basis that is better suited to the problem. This brings us to the Newton form of the interpolating polynomial.

The core idea is incremental construction. Let’s build the polynomial one point at a time.

  • For one point $(x_0, y_0)$, the polynomial is a constant: $p_0(x) = c_0$. We simply set $c_0 = y_0$.
  • To add a second point $(x_1, y_1)$, we don’t start from scratch. We add a new term that is zero at $x_0$, so it doesn’t disturb our previous work: $p_1(x) = p_0(x) + c_1 (x - x_0)$. We then choose $c_1$ to satisfy the new condition $p_1(x_1) = y_1$, which gives $c_1 = \dfrac{y_1 - y_0}{x_1 - x_0}$.

This leads to the Newton basis: $\{1,\ (x - x_0),\ (x - x_0)(x - x_1),\ \dots,\ (x - x_0)(x - x_1)\cdots(x - x_{n-1})\}$. The polynomial is written as:

$$p_n(x) = c_0 + c_1 (x - x_0) + c_2 (x - x_0)(x - x_1) + \dots + c_n \prod_{i=0}^{n-1} (x - x_i).$$

The coefficients $c_i$ are called divided differences. If we write out the system of equations for the $c_i$, we find something wonderful: the matrix is lower triangular!

Solving a triangular system is fast and stable. The coefficients can be found via a simple recurrence. We define the divided differences, with the notation $f[x_i, \dots, x_{i+j}]$, as follows:

$$f[x_i] = y_i, \qquad f[x_i, x_{i+1}, \dots, x_{i+j}] = \frac{f[x_{i+1}, \dots, x_{i+j}] - f[x_i, \dots, x_{i+j-1}]}{x_{i+j} - x_i}.$$

These can be computed efficiently in a table. The coefficients $c_j = f[x_0, \dots, x_j]$ for the Newton polynomial are simply the top diagonal of this table. The total cost is a stable $\mathcal{O}(n^2)$.

A Practical Example: Divided Differences in Action

This process is best understood with a concrete example. Let’s find the interpolating polynomial that passes through these four points:

$$(x_0, y_0) = (0, 1), \quad (x_1, y_1) = (1, 4), \quad (x_2, y_2) = (3, 2), \quad (x_3, y_3) = (4, 2).$$

We build the divided difference table column by column.

  • Column 0 (Zeroth differences): These are just the $y$ values: $f[x_0] = 1$, $f[x_1] = 4$, $f[x_2] = 2$, $f[x_3] = 2$.

  • Column 1 (First differences): $f[x_0, x_1] = \frac{4 - 1}{1 - 0} = 3$, $\quad f[x_1, x_2] = \frac{2 - 4}{3 - 1} = -1$, $\quad f[x_2, x_3] = \frac{2 - 2}{4 - 3} = 0$.

  • Column 2 (Second differences): $f[x_0, x_1, x_2] = \frac{-1 - 3}{3 - 0} = -\frac{4}{3}$, $\quad f[x_1, x_2, x_3] = \frac{0 - (-1)}{4 - 1} = \frac{1}{3}$.

  • Column 3 (Third differences): $f[x_0, x_1, x_2, x_3] = \frac{\frac{1}{3} - \left(-\frac{4}{3}\right)}{4 - 0} = \frac{5}{12}$.

The coefficients for the Newton polynomial are the top entries in each column: $c_0 = 1$, $c_1 = 3$, $c_2 = -\frac{4}{3}$, $c_3 = \frac{5}{12}$.

So, our interpolating polynomial is:

$$p_3(x) = 1 + 3x - \frac{4}{3}\,x(x - 1) + \frac{5}{12}\,x(x - 1)(x - 3).$$

Implementation and Evaluation

Here is the complete process in Python. First, a function to compute the coefficients, then a function to evaluate the polynomial using a nested form analogous to Horner’s method.

import numpy as np
import matplotlib.pyplot as plt
 
def divided_differences(x_nodes, y_nodes):
    """
    Computes the coefficients of the Newton interpolating polynomial.
    Returns the top diagonal of the divided difference table.
    """
    n = len(y_nodes)
    coeffs = np.zeros([n, n])
    # The first column is the y values
    coeffs[:, 0] = y_nodes
    
    # Iterate to compute the table
    for j in range(1, n):
        for i in range(n - j):
            coeffs[i, j] = (coeffs[i+1, j-1] - coeffs[i, j-1]) / (x_nodes[i+j] - x_nodes[i])
            
    # The coefficients are the first row
    return coeffs[0]
 
def newton_eval(coeffs, x_nodes, x_eval):
    """
    Evaluates the Newton polynomial at points x_eval.
    Uses a nested evaluation scheme (Newton-Horner).
    """
    n = len(coeffs) - 1
    # Start with the highest degree coefficient
    p = coeffs[n]
    
    # Loop backwards
    for i in range(n - 1, -1, -1):
        p = p * (x_eval - x_nodes[i]) + coeffs[i]
        
    return p
 
# --- Main Example ---
# Our data points
x_data = np.array([0, 1, 3, 4], dtype=float)
y_data = np.array([1, 4, 2, 2], dtype=float)
 
# 1. Compute the Newton coefficients
newton_coeffs = divided_differences(x_data, y_data)
print(f"Newton Coefficients: {newton_coeffs}")
 
# 2. Create a fine grid for plotting
x_plot = np.linspace(-0.5, 4.5, 400)
 
# 3. Evaluate the polynomial on the fine grid
y_plot = newton_eval(newton_coeffs, x_data, x_plot)
 
# 4. Plot the results
plt.figure(figsize=(8, 6))
plt.plot(x_data, y_data, 'ro', label='Data Points')
plt.plot(x_plot, y_plot, 'b-', label='Newton Interpolating Polynomial')
plt.title("Newton Polynomial Interpolation")
plt.xlabel("x")
plt.ylabel("y")
plt.grid(True)
plt.legend()
plt.show()

The code confirms our hand-calculated coefficients and produces a smooth curve that perfectly threads through the data points, just as we designed.

The Error of Interpolation

The Newton form gives us a powerful tool to analyze the error. For a function $f$, the divided difference $f[x_0, \dots, x_n]$ is intimately related to its derivatives.

Theorem 2.2.11: Divided Differences and Derivatives. If $f$ is $n$ times continuously differentiable, then for any $n+1$ distinct points $x_0, x_1, \dots, x_n$, there exists a point $\xi$ in the interval spanned by these points such that:

$$f[x_0, x_1, \dots, x_n] = \frac{f^{(n)}(\xi)}{n!}.$$

This beautiful result connects the discrete world of our data points to the continuous world of calculus. It leads directly to the fundamental error formula for polynomial interpolation.

Theorem 2.2.12: Interpolation Error Formula. Let $p_n$ be the polynomial of degree at most $n$ that interpolates a function $f$ at the nodes $x_0, x_1, \dots, x_n$. If $f$ is $n+1$ times continuously differentiable, then for any $x$, there exists a $\xi$ in the interval spanned by $x, x_0, \dots, x_n$ such that:

$$f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^{n} (x - x_i).$$

This formula is the key to everything. It tells us the error depends on three things:

  1. The smoothness of the function, via the derivative $f^{(n+1)}$. Smoother functions are easier to interpolate.
  2. The length of the interval, hidden in the size of the factors $(x - x_i)$ in the product term.
  3. The placement of the nodes $x_i$, which controls the nodal polynomial $\omega(x) = \prod_{i=0}^{n} (x - x_i)$. (A rough worked bound follows this list.)
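
As a quick sanity check of the formula (an illustration of mine, not from the original text): suppose we interpolate $f(x) = \sin x$ on $[0, 1]$ at five nodes, so $n = 4$. Every derivative of $\sin x$ is bounded by $1$, and each factor satisfies $|x - x_i| \le 1$ on this interval, so

$$|f(x) - p_4(x)| \le \frac{\max_{\xi} |f^{(5)}(\xi)|}{5!} \prod_{i=0}^{4} |x - x_i| \le \frac{1}{120} \approx 0.008,$$

no matter where the five nodes sit in $[0, 1]$. Sharper, node-dependent bounds come from estimating the nodal polynomial more carefully, which is exactly where the rest of the chapter goes.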

The Lagrange Basis and Barycentric Form: The Pinnacle of Practice

Let’s try one more basis. What if we designed a basis so simple that finding the coefficients required no work at all? This is the idea behind the Lagrange basis.

The Lagrange basis polynomials, $\ell_0(x), \ell_1(x), \dots, \ell_n(x)$, are defined by the elegant property:

$$\ell_k(x_j) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \neq k. \end{cases}$$

Each basis polynomial is 1 at its “home” node and 0 at all other nodes. You can think of them as switches. We can construct them explicitly:

$$\ell_k(x) = \prod_{\substack{j=0 \\ j \neq k}}^{n} \frac{x - x_j}{x_k - x_j}.$$

With this basis, the interpolating polynomial is trivial to write down:

$$p_n(x) = \sum_{k=0}^{n} y_k \, \ell_k(x).$$

Each term $y_k \ell_k(x)$ ensures the polynomial has the value $y_k$ at $x_k$, and since all the other basis polynomials are zero there, they don’t interfere.

While beautiful for theory, this form is slow for computation. Evaluating each $\ell_k(x)$ takes $\mathcal{O}(n)$ work, making the total cost to evaluate $p_n(x)$ at a single point a sluggish $\mathcal{O}(n^2)$.
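
For concreteness, here is a minimal, naive evaluation of the Lagrange form (a hypothetical helper, written only to make the $\mathcal{O}(n^2)$ cost per point visible):

import numpy as np

def lagrange_eval(x_nodes, y_nodes, x):
    """Evaluate the Lagrange-form interpolant at a scalar x.

    Cost: O(n) per basis polynomial, O(n^2) per evaluation point.
    """
    total = 0.0
    n = len(x_nodes)
    for k in range(n):
        # Build l_k(x) = prod_{j != k} (x - x_j) / (x_k - x_j)
        ell = 1.0
        for j in range(n):
            if j != k:
                ell *= (x - x_nodes[j]) / (x_nodes[k] - x_nodes[j])
        total += y_nodes[k] * ell
    return total

# Sanity check on the data from the Newton example: reproduces each y exactly
x_data = np.array([0.0, 1.0, 3.0, 4.0])
y_data = np.array([1.0, 4.0, 2.0, 2.0])
print([lagrange_eval(x_data, y_data, xi) for xi in x_data])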

However, a clever algebraic rearrangement of the Lagrange formula leads to the barycentric interpolation formula, arguably the best way to evaluate an interpolating polynomial in practice:

$$p_n(x) = \frac{\displaystyle\sum_{k=0}^{n} \frac{w_k}{x - x_k}\, y_k}{\displaystyle\sum_{k=0}^{n} \frac{w_k}{x - x_k}}, \qquad w_k = \frac{1}{\prod_{j \neq k} (x_k - x_j)}.$$

This formula is a masterpiece of numerical design:

  • Fast: After a one-time $\mathcal{O}(n^2)$ cost to compute the weights $w_k$, each subsequent evaluation of $p_n(x)$ costs only $\mathcal{O}(n)$.
  • Stable: It is numerically stable, even when evaluating at points very close to one of the nodes.
  • Flexible: If you add a new point, the existing weights can be updated in $\mathcal{O}(n)$ work rather than recomputed from scratch.

The Barycentric Formula is the Right Way

For practical computation, the barycentric form is the method of choice. It combines the theoretical elegance of the Lagrange basis with the speed and stability needed for real-world applications. scipy.interpolate.BarycentricInterpolator is your tool for this.
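
As a brief sketch of how this looks with SciPy (the call pattern below is my own illustration, reusing the data from the Newton example):

import numpy as np
from scipy.interpolate import BarycentricInterpolator

# Same data as the Newton example
x_data = np.array([0.0, 1.0, 3.0, 4.0])
y_data = np.array([1.0, 4.0, 2.0, 2.0])

# One-time setup computes the barycentric weights
p = BarycentricInterpolator(x_data, y_data)

# Each call is a cheap O(n)-per-point evaluation
x_eval = np.linspace(-0.5, 4.5, 5)
print(p(x_eval))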

The Runge Phenomenon: A Shocking Failure

We now have a powerful, stable, and efficient method. Let’s put it to the test on a simple, smooth, bell-shaped function on the interval $[-1, 1]$, known as the Runge function:

$$f(x) = \frac{1}{1 + 25x^2}.$$

We’ll interpolate it at $n+1$ equally spaced points. What could possibly go wrong?

The result is a catastrophe. The polynomial matches the function at the nodes, but it oscillates wildly near the ends of the interval. And it gets worse: as we add more equally spaced points, the oscillations grow larger and more violent. This is the infamous Runge phenomenon.

What’s happening? The error formula told us everything: the error is proportional to the nodal polynomial $\omega(x) = \prod_{i=0}^{n} (x - x_i)$. For equally spaced points, this nodal polynomial is small in the middle of the interval but grows exponentially large near the endpoints. The polynomial has too much freedom, and it uses that freedom to wiggle uncontrollably.
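
Here is a small experiment along those lines (my own illustrative sketch, using the barycentric interpolator to avoid the Vandermonde issues) that makes the failure measurable:

import numpy as np
from scipy.interpolate import BarycentricInterpolator

def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)

x_fine = np.linspace(-1, 1, 2001)

# Interpolate at n+1 equally spaced points for increasing n
for n in [5, 10, 15, 20]:
    x_nodes = np.linspace(-1, 1, n + 1)
    p = BarycentricInterpolator(x_nodes, runge(x_nodes))
    max_err = np.max(np.abs(p(x_fine) - runge(x_fine)))
    print(f"n = {n:2d}: max error = {max_err:.2e}")

# The maximum error grows as n increases: the Runge phenomenon.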

The formal way to measure this instability is with the Lebesgue constant, $\Lambda_n$, defined as the maximum over the interval of the Lebesgue function $\lambda_n(x) = \sum_{k=0}^{n} |\ell_k(x)|$. It bounds how much errors in the data values $y_k$ can be amplified.

Theorem 2.3.8: Error Bound with Lebesgue Constant. The error of the interpolating polynomial is bounded by:

$$\|f - p_n\|_\infty \le (1 + \Lambda_n)\, \|f - p_n^*\|_\infty,$$

where $p_n^*$ is the best possible polynomial approximation to $f$ of degree at most $n$.

This theorem is profound. It says that the error of our interpolant is at most $1 + \Lambda_n$ times the error of the best possible polynomial approximation of that degree. If $\Lambda_n$ is small, our interpolant is nearly optimal. If $\Lambda_n$ is huge, our interpolant can be terrible, even if a good approximation exists.

For equally spaced points, the Lebesgue constant grows exponentially, roughly like $\Lambda_n \sim \dfrac{2^{n+1}}{e\, n \ln n}$. This is the mathematical signature of the Runge phenomenon.
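
One way to see these numbers directly is to estimate $\Lambda_n$ by sampling the Lebesgue function on a fine grid; the helper below is my own sketch, not part of the chapter’s code:

import numpy as np

def lebesgue_constant(x_nodes, n_samples=5000):
    """Estimate the Lebesgue constant by sampling sum_k |l_k(x)| on a fine grid."""
    x = np.linspace(x_nodes.min(), x_nodes.max(), n_samples)
    lam = np.zeros_like(x)
    for k in range(len(x_nodes)):
        ell = np.ones_like(x)
        for j in range(len(x_nodes)):
            if j != k:
                ell *= (x - x_nodes[j]) / (x_nodes[k] - x_nodes[j])
        lam += np.abs(ell)
    return lam.max()

for n in [5, 10, 15, 20]:
    nodes = np.linspace(-1, 1, n + 1)
    print(f"n = {n:2d}: Lebesgue constant ~ {lebesgue_constant(nodes):.3g}")

# The estimates blow up rapidly as n grows, matching the exponential rate above.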

The Solution: Chebyshev Points

The problem wasn’t the polynomial. It was our choice of points. The error formula points the way: to make the interpolation good, we must choose the nodes to make the nodal polynomial as small as possible across the entire interval.

This is a famous problem with a beautiful solution. The optimal points are not equally spaced. They are the Chebyshev points.

They are the projections onto the x-axis of points equally spaced around a semicircle. This causes them to be clustered near the endpoints of the interval, precisely where the Runge phenomenon was worst. The formula for the Chebyshev nodes of the second kind (the extrema of $T_n$) on $[-1, 1]$ is:

$$x_k = \cos\!\left(\frac{k\pi}{n}\right), \qquad k = 0, 1, \dots, n.$$

Let’s repeat our experiment with these “magic” points.

The result is a triumph. The oscillations are gone. The polynomial is now a spectacular approximation to the function. As we increase , the polynomial converges rapidly to the true function.
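
Repeating the earlier sketch with Chebyshev points in place of equally spaced ones (again, my own illustrative code) shows the turnaround:

import numpy as np
from scipy.interpolate import BarycentricInterpolator

def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)

x_fine = np.linspace(-1, 1, 2001)

for n in [5, 10, 15, 20]:
    # Chebyshev points of the second kind: x_k = cos(k*pi/n)
    x_cheb = np.cos(np.pi * np.arange(n + 1) / n)
    p = BarycentricInterpolator(x_cheb, runge(x_cheb))
    max_err = np.max(np.abs(p(x_fine) - runge(x_fine)))
    print(f"n = {n:2d}: max error = {max_err:.2e}")

# The maximum error now shrinks steadily as n grows.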

Why does this work? The nodal polynomial for Chebyshev points is, in fact, a scaled Chebyshev polynomial, which has the unique property of having the smallest possible maximum magnitude on $[-1, 1]$ of any monic polynomial of its degree. Quantitatively, the Lebesgue constant for Chebyshev points grows only logarithmically: $\Lambda_n = \mathcal{O}(\log n)$. This slow growth guarantees stability and convergence for any reasonably smooth function.

The Golden Rule of Polynomial Interpolation

If you must use a single high-degree polynomial, never use equally spaced points. Use Chebyshev points. They are strategically placed to suppress oscillations and guarantee convergence.

Chebyshev Polynomials and the FFT Connection

The theory behind these points involves the remarkable Chebyshev polynomials, defined by $T_n(x) = \cos(n \arccos x)$. While this definition involves trig functions, they are indeed polynomials, satisfying a three-term recurrence relation:

$$T_0(x) = 1, \qquad T_1(x) = x, \qquad T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x).$$

These polynomials form an orthogonal basis. This means we can represent our interpolating polynomial as a sum of Chebyshev polynomials:

$$p_n(x) = \sum_{k=0}^{n} a_k\, T_k(x).$$

The amazing part is that the coefficients $a_k$ can be computed very quickly. The formula for the coefficients turns out to be a Discrete Cosine Transform (DCT) of the function values at the Chebyshev points. The DCT is a close relative of the Fast Fourier Transform (FFT). This means we can find the coefficients not in $\mathcal{O}(n^2)$ time, but in $\mathcal{O}(n \log n)$ time.
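
Here is a small sketch of that idea (my own illustration: it uses Chebyshev points of the first kind, for which the coefficient formula is a type-II DCT, and then checks that the resulting Chebyshev series reproduces the sampled values):

import numpy as np
from scipy.fft import dct
from numpy.polynomial import chebyshev as C

def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)

N = 32  # number of interpolation points

# Chebyshev points of the first kind on [-1, 1]
k = np.arange(N)
x_nodes = np.cos((2 * k + 1) * np.pi / (2 * N))

# Chebyshev coefficients via a type-II DCT of the nodal values: O(N log N)
a = dct(runge(x_nodes), type=2) / N
a[0] *= 0.5

# Check: the Chebyshev series reproduces the function at the nodes
p_at_nodes = C.chebval(x_nodes, a)
print(np.max(np.abs(p_at_nodes - runge(x_nodes))))   # ~ machine precision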

This connects three great ideas: stable polynomial interpolation, the theory of orthogonal polynomials, and the immense computational power of the FFT. It’s a perfect example of how deep connections in mathematics lead to powerful, practical algorithms.

Next: Chapter 3 - Trigonometric Interpolation