In the last chapter, we found a powerful tool for interpolation: the polynomial. With a careful choice of points, the Chebyshev points, we could create excellent approximations. But polynomials have a fundamental character: they want to go to infinity. They are not periodic. If you try to model a phenomenon that repeats itself, a sound wave, a heartbeat, the seasons, forcing a polynomial to do the job is like trying to draw a circle with a set of straight rulers. It can be done, but it’s awkward and unnatural.

We need a new set of building blocks, a new language designed from the ground up for periodicity. This is the world of sines and cosines. This is the world of Jean-Baptiste Joseph Fourier. His revolutionary idea, born from studying the flow of heat, was that any periodic function could be represented as a sum of simple, harmonically related sinusoids. This was more than a new technique; it was a new way of seeing. It gave us a pair of glasses to switch between viewing a function as a shape in time and as a spectrum of frequencies.

The Continuous World: Fourier Series

Instead of building our functions from powers of $x$, like $1, x, x^2, \dots$, we will build them from a basis of pure oscillations. It’s most elegant to use complex exponentials, which wrap up sine and cosine into one beautiful package thanks to Euler’s formula, $e^{i\theta} = \cos\theta + i\sin\theta$. Using $e^{i\theta}$ is mathematically convenient because it combines the properties of both sine and cosine, simplifying the algebra greatly.

A trigonometric polynomial of degree $n$ is a function of the form:

$$p(x) = \sum_{k=-n}^{n} c_k\, e^{2\pi i k x},$$

where the $c_k$ are complex coefficients. This function is naturally periodic with a period of 1. If the coefficients satisfy the symmetry $c_{-k} = \overline{c_k}$ (which means the coefficient for frequency $-k$ is the complex conjugate of the coefficient for frequency $k$), the function will be purely real-valued.
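
To see the symmetry condition in action, here is a small numerical check (a sketch using NumPy; the coefficient values are arbitrary choices for illustration). We pick coefficients with $c_{-k} = \overline{c_k}$ and confirm that the resulting trigonometric polynomial is real up to rounding error.

import numpy as np

n = 3
x = np.linspace(0, 1, 200)

rng = np.random.default_rng(0)
c_pos = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # c_1, ..., c_n (arbitrary)
c0 = 0.7                                                      # c_0 must be real

# Build p(x) = sum_{k=-n}^{n} c_k exp(2*pi*i*k*x) with c_{-k} = conj(c_k)
p = c0 * np.ones_like(x, dtype=complex)
for k in range(1, n + 1):
    p += c_pos[k - 1] * np.exp(2j * np.pi * k * x)
    p += np.conj(c_pos[k - 1]) * np.exp(-2j * np.pi * k * x)

print(np.max(np.abs(p.imag)))   # essentially zero: the function is real-valued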

Functions as Vectors in an Infinite-Dimensional Space

To truly grasp Fourier’s idea, we must think like a linear algebraist, as Gilbert Strang would urge us to. Let’s consider the space of all complex-valued functions on the interval $[0, 1]$ that have a finite “energy,” a space called $L^2([0, 1])$. (Think of $L^2$ as the space of all functions that have a finite “length” or magnitude when measured continuously.) In this space, functions are not just curves; they are vectors. We can define an inner product (the continuous version of a dot product) between two functions $f$ and $g$:

$$\langle f, g \rangle = \int_0^1 f(x)\, \overline{g(x)}\, dx$$

This inner product measures how much the two functions align or overlap. The “length” or norm of a function is then $\|f\| = \sqrt{\langle f, f \rangle}$.

The magic of the complex exponentials, $\phi_k(x) = e^{2\pi i k x}$, is that they form an orthonormal basis for this infinite-dimensional vector space. “Orthonormal” means these building blocks are perfectly independent of each other (orthogonal) and all have unit length (normalized). Let’s check the orthogonality:

$$\langle \phi_k, \phi_j \rangle = \int_0^1 e^{2\pi i k x}\, e^{-2\pi i j x}\, dx = \int_0^1 e^{2\pi i (k - j) x}\, dx = \begin{cases} 1 & \text{if } k = j, \\ 0 & \text{if } k \neq j. \end{cases}$$

They are perfectly orthogonal, just like the standard basis vectors in $\mathbb{R}^n$. This orthogonality is not just a mathematical curiosity; it is the key that unlocks everything.

Finding the Fourier Coefficients

Because the basis is orthonormal, finding the coordinates of a function is easy. Just as you find the $k$-th coordinate of a vector $v$ by projecting it onto the $k$-th basis vector ($v_k = \langle v, e_k \rangle$), we find the Fourier coefficients $c_k$ of a function $f$ by projecting it onto our basis functions:

$$c_k = \langle f, \phi_k \rangle = \int_0^1 f(x)\, e^{-2\pi i k x}\, dx$$

The coefficient $c_k$ tells us exactly how much of the wave with frequency $k$ is contained in the function $f$.

The Fourier Series Theorem then states that we can reconstruct the original function perfectly by summing up these basis functions weighted by their coefficients:

$$f(x) = \sum_{k=-\infty}^{\infty} c_k\, e^{2\pi i k x}$$

This is a profound statement. It says any well-behaved periodic function is a superposition of simple waves. The set of coefficients $\{c_k\}$ is the function’s representation in the frequency domain.

The Dialogue Between Smoothness and Decay

Here we come to a crucial insight. There is a beautiful and deep connection between how smooth a function is in the time domain and how quickly its Fourier coefficients decay in the frequency domain. What happens when we differentiate a Fourier series?

The Fourier coefficients of the derivative, $f'$, are simply $(2\pi i k)\, c_k$. For the series of the derivative to converge (i.e., for its energy to be finite), the coefficients $c_k$ must decay faster than $1/k$. For the second derivative, they must decay faster than $1/k^2$, and so on.
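
This relation is the basis of spectral differentiation. The sketch below (a NumPy illustration; the test function $e^{\sin 2\pi x}$ is just a convenient smooth periodic example) multiplies the DFT coefficients by $2\pi i k$ and transforms back, recovering the derivative almost to machine precision.

import numpy as np

N = 64
x = np.arange(N) / N                          # equally spaced points on [0, 1)
f = np.exp(np.sin(2 * np.pi * x))             # a smooth, periodic test function
df_exact = 2 * np.pi * np.cos(2 * np.pi * x) * f

k = np.fft.fftfreq(N, d=1/N)                  # the integer frequencies ..., -2, -1, 0, 1, 2, ...
c = np.fft.fft(f)
df = np.fft.ifft(2j * np.pi * k * c).real     # differentiate in the frequency domain

print(np.max(np.abs(df - df_exact)))          # tiny: spectral accuracy for smooth functions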

Smoothness is Rapid Decay

The smoother a function is, the faster its Fourier coefficients must go to zero as the frequency $|k| \to \infty$. This means for smooth functions, we only need to keep a few low-frequency coefficients to achieve excellent accuracy.

  • A function with a jump discontinuity (the least smooth): $|c_k| \sim 1/|k|$.
  • A continuous function with a sharp corner: $|c_k| \sim 1/k^2$.
  • An infinitely smooth function (like a Gaussian): $|c_k|$ decays faster than any power of $k$ (exponential decay).

This dialogue is the key to signal compression, denoising, and understanding why our methods work.
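
The sketch below makes these decay rates visible (assuming the same NumPy/Matplotlib setup used later in this chapter; the three test functions are arbitrary representatives of each smoothness class). On a log-log plot, the nonzero coefficients of the square wave decay like $1/k$, those of the triangle wave like $1/k^2$, and those of the smooth function plunge toward machine precision.

import numpy as np
import matplotlib.pyplot as plt

N = 2048
x = np.arange(N) / N

signals = {
    "jump (square wave)":     np.where(x < 0.5, 1.0, -1.0),
    "corner (triangle wave)": 1 - np.abs(2 * x - 1),
    "smooth (exp(sin))":      np.exp(np.sin(2 * np.pi * x)),
}

k = np.arange(1, N // 2)
for name, f in signals.items():
    c = np.fft.fft(f) / N                     # approximate Fourier coefficients
    # (even-indexed coefficients of the first two signals are exactly zero by symmetry;
    #  the small offset keeps them on the logarithmic plot)
    plt.loglog(k, np.abs(c[1:N // 2]) + 1e-18, label=name)

plt.loglog(k, 1.0 / k, "k--", label="1/k reference")
plt.loglog(k, 1.0 / k**2, "k:", label="1/k^2 reference")
plt.xlabel("frequency k")
plt.ylabel("|c_k|")
plt.legend()
plt.show()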

The Discrete World: The Discrete Fourier Transform (DFT)

The Fourier series is a tool for the continuous world of pure mathematics. On a computer, we only have a finite number of samples. Suppose we measure a function $f$ at $N$ equally spaced points in the interval $[0, 1)$:

$$x_j = \frac{j}{N}, \qquad j = 0, 1, \dots, N-1$$

Our data is a vector of $N$ values $f_j = f(x_j)$, where $j = 0, \dots, N-1$.

How can we approximate the Fourier coefficients from this discrete data? The most natural approach is to replace the integral in the definition of $c_k$ with a numerical quadrature rule. The simplest one for a periodic function is the trapezoidal rule (which, in this case, is equivalent to a simple Riemann sum):

$$c_k \approx \frac{1}{N} \sum_{j=0}^{N-1} f_j\, e^{-2\pi i k x_j} = \frac{1}{N} \sum_{j=0}^{N-1} f_j\, e^{-2\pi i k j / N}$$

Let’s define a vector of coefficients, $\hat{f}_k$, based on this sum (we’ll ignore the scaling factor $1/N$ for now):

$$\hat{f}_k = \sum_{j=0}^{N-1} f_j\, e^{-2\pi i k j / N}, \qquad k = 0, 1, \dots, N-1$$

This formula is the Discrete Fourier Transform (DFT). It is a linear transformation that takes a vector of $N$ samples in the “time domain” and produces a vector of $N$ coefficients in the “frequency domain.”
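
In code, the DFT is nothing more than this sum evaluated for each $k$. Here is a direct (and deliberately slow, $O(N^2)$) sketch that you can check against NumPy’s built-in routine, which computes exactly the same unscaled sum.

import numpy as np

def dft_direct(f):
    """Compute the DFT by summing the definition directly: O(N^2)."""
    N = len(f)
    idx = np.arange(N)
    return np.array([np.sum(f * np.exp(-2j * np.pi * k * idx / N)) for k in range(N)])

f = np.random.randn(32)
print(np.allclose(dft_direct(f), np.fft.fft(f)))   # True: the same transform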

The Miracle of Discrete Interpolation

Now for a moment of magic. What happens if we build a trigonometric polynomial using these approximated coefficients, $\hat{f}_k$?

$$p(x) = \frac{1}{N} \sum_{k=0}^{N-1} \hat{f}_k\, e^{2\pi i k x}$$

Let’s evaluate this polynomial at one of our sample points, $x_m = m/N$:

$$p(x_m) = \frac{1}{N} \sum_{k=0}^{N-1} \left( \sum_{j=0}^{N-1} f_j\, e^{-2\pi i k j / N} \right) e^{2\pi i k m / N} = \sum_{j=0}^{N-1} f_j \left( \frac{1}{N} \sum_{k=0}^{N-1} e^{2\pi i k (m - j) / N} \right)$$

The term in the parenthesis is a geometric sum of the powers of the $N$-th root of unity, $e^{2\pi i / N}$. This sum has a remarkable property, often called the discrete orthogonality relation:

$$\frac{1}{N} \sum_{k=0}^{N-1} e^{2\pi i k (m - j) / N} = \begin{cases} 1 & \text{if } m - j \equiv 0 \pmod{N}, \\ 0 & \text{otherwise.} \end{cases}$$

(Don’t worry about the detailed derivation here; the key is the outcome.) Since both $m$ and $j$ are between $0$ and $N-1$, this is equivalent to saying the sum is 1 if $j = m$ and 0 otherwise. The inner sum collapses, acting like a Kronecker delta, picking out only the term where $j = m$. The result is:

$$p(x_m) = f_m$$
This is a stunning result. The trigonometric polynomial built from the DFT coefficients exactly interpolates the original data points. The numerical approximation we made for the continuous integral has, by a beautiful symmetry of mathematics (the discrete orthogonality), become an exact interpolation formula.
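
We can check this numerically (a small sketch; the sample values are an arbitrary choice). We build the trigonometric interpolant from the DFT coefficients and evaluate it back at the sample points; the original data is recovered to rounding error.

import numpy as np

N = 16
x = np.arange(N) / N
f = np.cos(2 * np.pi * x) + x * (1 - x)           # any sample values will do

fhat = np.fft.fft(f)                              # unscaled DFT coefficients

# Evaluate p(x_m) = (1/N) * sum_k fhat_k * exp(2*pi*i*k*x_m) at each sample point
k = np.arange(N)
p = np.array([np.sum(fhat * np.exp(2j * np.pi * k * xm)) / N for xm in x])

print(np.max(np.abs(p - f)))                      # ~1e-15: exact interpolation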

The DFT as a Change of Basis

The DFT is a linear transformation, which we can write as a matrix-vector product, $\hat{f} = F f$. The matrix $F$ is the DFT matrix:

$$F = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega & \omega^2 & \cdots & \omega^{N-1} \\ 1 & \omega^2 & \omega^4 & \cdots & \omega^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega^{N-1} & \omega^{2(N-1)} & \cdots & \omega^{(N-1)^2} \end{pmatrix}, \qquad \omega = e^{-2\pi i / N}$$

This matrix is a change of basis matrix. It transforms a vector from the standard basis (the “time domain”) into the discrete Fourier basis (the “frequency domain”). The columns of the inverse Fourier matrix, $F^{-1} = \frac{1}{N} F^{*}$, are the discrete basis vectors themselves.
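
For small $N$ we can build $F$ explicitly and confirm both facts (a sketch for illustration only; forming and applying the matrix costs $O(N^2)$, so never do this for large $N$).

import numpy as np

N = 8
idx = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / N)        # the N x N DFT matrix

f = np.random.randn(N)
print(np.allclose(F @ f, np.fft.fft(f)))                # True: the DFT is this matrix product
print(np.allclose(np.linalg.inv(F), F.conj().T / N))    # True: F^{-1} = F* / N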

The Algorithm: The Fast Fourier Transform (FFT)

The DFT is a beautiful theoretical tool, but a direct matrix-vector multiplication (calculating all $N$ coefficients by summing $N$ terms each) costs $O(N^2)$ operations. For a million data points ($N = 10^6$), that’s a trillion operations, impossibly slow.

The Fast Fourier Transform (FFT) is a revolutionary algorithm that computes the exact same DFT in only $O(N \log N)$ time. For $N = 10^6$, this is the difference between a calculation taking hours or days and one taking milliseconds. It was rediscovered by Cooley and Tukey in 1965 and is one of the most important algorithms ever developed.

The idea is a classic divide and conquer. A DFT of size $N$ can be broken down into smaller DFTs. If $N$ is an even number, we can split the sum into even and odd indices:

$$\hat{f}_k = \sum_{j=0}^{N/2-1} f_{2j}\, e^{-2\pi i k (2j)/N} + \sum_{j=0}^{N/2-1} f_{2j+1}\, e^{-2\pi i k (2j+1)/N} = \underbrace{\sum_{j=0}^{N/2-1} f_{2j}\, e^{-2\pi i k j/(N/2)}}_{\text{DFT of the even samples}} + e^{-2\pi i k / N} \underbrace{\sum_{j=0}^{N/2-1} f_{2j+1}\, e^{-2\pi i k j/(N/2)}}_{\text{DFT of the odd samples}}$$

Essentially, instead of solving one big problem of size $N$, we solve two half-sized problems (DFTs of size $N/2$), and then combine the results with only $O(N)$ extra steps (the “twiddle factors,” $e^{-2\pi i k / N}$). By applying this recursive splitting many times ($\log_2 N$ times), the total cost comes down from $O(N^2)$ to $O(N \log N)$.
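
Here is a minimal recursive radix-2 implementation of this idea (a teaching sketch that assumes the length is a power of two; production libraries such as the one behind np.fft.fft are far more sophisticated, but the core recursion is the same).

import numpy as np

def fft_recursive(f):
    """Radix-2 Cooley-Tukey FFT. Assumes len(f) is a power of 2."""
    f = np.asarray(f, dtype=complex)
    N = len(f)
    if N == 1:
        return f
    even = fft_recursive(f[0::2])                 # DFT of the even-indexed samples
    odd = fft_recursive(f[1::2])                  # DFT of the odd-indexed samples
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    return np.concatenate([even + twiddle * odd,
                           even - twiddle * odd])

f = np.random.randn(64)
print(np.allclose(fft_recursive(f), np.fft.fft(f)))   # True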

Applications and Practical Details

The FFT is not just a mathematical curiosity; it is a workhorse of modern science and engineering.

Frequency Analysis: The Signal’s DNA

The FFT is like a prism for data. It takes a complex signal and reveals the pure frequencies it’s composed of. The magnitude squared of the Fourier coefficients, $|\hat{f}_k|^2$, is called the power spectrum. Peaks in the power spectrum correspond to the dominant frequencies in the signal.

Example: Finding Frequencies in a Noisy Signal

Let’s create a signal composed of two sine waves (at 1 Hz and 7 Hz) and add some random noise. Can we recover the original frequencies?

import numpy as np
import matplotlib.pyplot as plt
 
N = 128  # Number of sample points
t = np.linspace(0, 1, N, endpoint=False) # Time vector (samples taken over 1 second)
 
# Create a signal with two frequencies + noise
y = np.sin(1 * 2 * np.pi * t) + 0.5 * np.sin(7 * 2 * np.pi * t)
y += 0.5 * np.random.randn(N)
 
# Compute the DFT and the power spectrum
c = np.fft.fft(y)
power = np.abs(c)**2
 
# The frequencies corresponding to the coefficients (in Hz)
freqs = np.fft.fftfreq(N, d=1/N)
 
# Plotting
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(t, y)
plt.title("Signal in Time Domain (Looks like noise!)")
plt.xlabel("Time")
 
plt.subplot(1, 2, 2)
# We only plot the first half of the frequencies, due to symmetry
plt.plot(freqs[:N//2], power[:N//2])
plt.title("Power Spectrum (The Signal's DNA)")
plt.xlabel("Frequency (Hz)")
plt.grid()
plt.show()

The plot of the power spectrum will show two sharp peaks at exactly 1 Hz and 7 Hz, rising clearly above the noise floor. The FFT has successfully extracted the underlying components from the seemingly random data.

Aliasing: The Wagon-Wheel Effect

A crucial question is: how many samples do we need? The Nyquist-Shannon sampling theorem provides the definitive answer. If the highest frequency present in a continuous signal is $f_{\max}$, you must sample at a rate greater than $2 f_{\max}$ (the Nyquist rate) to avoid losing information.

If you sample too slowly, high frequencies get “folded down” and masquerade as lower frequencies. This is called aliasing. It’s the reason why in old movies, the wheels of a speeding stagecoach sometimes appear to spin slowly backward. The camera’s frame rate (the sampling frequency) is too slow to capture the rapid rotation of the wheel spokes, so the sampled images alias into an apparent slow, backward motion.

Beware of Aliasing

If you don’t sample fast enough, you will see ghosts. High frequencies will appear as low frequencies, and you will draw completely wrong conclusions from your data. Always be sure your sampling rate is at least twice the highest frequency you care about.
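
A quick experiment makes the ghost visible (a sketch; the frequencies are arbitrary choices). A 9 Hz cosine sampled at only 12 samples per second, below its Nyquist rate of 18, produces exactly the same samples as a 3 Hz cosine, and its energy lands in the 3 Hz bin of the spectrum.

import numpy as np

fs = 12                                    # sampling rate: 12 samples per second
t = np.arange(24) / fs                     # two seconds of samples

fast = np.cos(2 * np.pi * 9 * t)           # a 9 Hz cosine, above the Nyquist limit fs/2 = 6 Hz
slow = np.cos(2 * np.pi * 3 * t)           # a 3 Hz cosine

print(np.allclose(fast, slow))             # True: the sampled signals are indistinguishable

freqs = np.fft.fftfreq(len(t), d=1/fs)
spectrum = np.abs(np.fft.fft(fast))
print(freqs[np.argmax(spectrum[:len(t)//2])])   # 3.0 Hz, not 9.0: the frequency has been aliased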

The Grand Finale: DFT and Chebyshev Interpolation

This chapter might seem far removed from the polynomials of Chapter 2, but here is the grand, unifying connection.

Recall the change of variables $x = \cos\theta$. This maps the interval $[-1, 1]$ for $x$ (the standard polynomial domain) to the interval $[0, \pi]$ for $\theta$.

  • Chebyshev points in $x$ are of the form $x_j = \cos(\theta_j)$. When you look at them in the $\theta$ variable, these are equally spaced points.
  • A polynomial in $x$, $p(x)$, becomes the function $p(\cos\theta)$. A remarkable fact is that each power $\cos^k\theta$ can be written as a sum of terms like $\cos(m\theta)$, which means $p(\cos\theta)$ is a trigonometric polynomial in the $\theta$ variable.

This leads to a spectacular conclusion:

Interpolating a function at Chebyshev nodes in $x$ is mathematically identical to performing trigonometric interpolation on the related, periodic function $f(\cos\theta)$ at equally spaced nodes in $\theta$.

This is a profound link between two worlds. It means we can use the incredible speed of the FFT to solve the polynomial interpolation problem. The coefficients of a Chebyshev expansion can be computed not in $O(n^2)$ time, but in $O(n \log n)$ time using a variant of the FFT called the Discrete Cosine Transform (DCT).
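
Here is a small sketch of that idea (assuming NumPy, with samples taken at the Chebyshev extreme points $\cos(\pi j/n)$; mirroring the data around the circle and calling one FFT is one standard way to realize the DCT, not necessarily how any particular library implements it). We sample a function at Chebyshev points, recover its Chebyshev coefficients with a single FFT, and check the resulting expansion against the function itself.

import numpy as np

def cheb_coeffs(f_vals):
    """Chebyshev coefficients of the degree-n interpolant through
    f_vals[j] = f(cos(pi*j/n)), j = 0..n, computed with one FFT."""
    n = len(f_vals) - 1
    extended = np.concatenate([f_vals, f_vals[-2:0:-1]])   # mirror: values repeat on the circle
    c = np.real(np.fft.fft(extended)) / n
    a = c[:n + 1].copy()
    a[0] /= 2
    a[n] /= 2
    return a

n = 40
xj = np.cos(np.pi * np.arange(n + 1) / n)       # Chebyshev points in [-1, 1]
f = lambda x: np.exp(x) * np.sin(5 * x)

a = cheb_coeffs(f(xj))

# The Chebyshev series with these coefficients reproduces this smooth f to near machine precision
x_fine = np.linspace(-1, 1, 7)
print(np.max(np.abs(np.polynomial.chebyshev.chebval(x_fine, a) - f(x_fine))))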

The stability and accuracy of Chebyshev polynomial interpolation are powered by the computational engine of the FFT. It’s a perfect example of how deep connections across different fields of mathematics lead to the most powerful and elegant numerical methods.

Next: Chapter 4 - Piecewise Polynomial Interpolation