Recap and Our Goal Today

In our last session, we introduced the revolutionary idea of Fourier analysis: any signal can be seen as a sum of simple waves.

  • We saw that a function’s smoothness is directly related to how quickly its Fourier coefficients decay. This is the secret behind data compression.
  • We moved from the continuous world of functions to the practical world of discrete samples, defining the Discrete Fourier Transform (DFT) as the tool to find the frequency components of our sampled data.

However, we left the DFT as a somewhat magical formula. Today, our goal is to demystify it completely. We will show that the DFT is not just a formula; it is a fundamental operation in linear algebra: a change of basis. We will build the entire DFT from the ground up and see how this change of perspective from the “time domain” to the “frequency domain” gives us incredible power to analyze and manipulate signals.

Signals as Vectors

Let’s start with our data. When we sample a continuous signal, we get a finite list of numbers. For example, if we record one second of audio at 8 Hz, we get 8 numbers representing the microphone’s pressure at each sample time.

This is a vector in an 8-dimensional space, $\mathbb{R}^8$. We call this the time-domain representation. It tells us the signal’s amplitude at each moment in time. This is the standard way we think about data, and it corresponds to representing our vector in the standard basis (where each basis vector has a single 1 and the rest are zeros).

But is this the most useful representation? If the signal was a pure musical note, the time-domain vector would look like a sampled sine wave. The crucial information isn’t the 8 individual values, but the single fact that it’s a “440 Hz sine wave.” Our goal is to find a new basis for our vector space that directly reveals this kind of frequency information.

Building a Basis for Frequency

What would a good basis for frequency look like? The basis vectors themselves should look like the fundamental components we’re searching for: simple, oscillating waves. We need a set of $N$ basis vectors for $\mathbb{C}^N$ that represent pure frequencies.

The Roots of Unity

To build these wave-like vectors, we need the fundamental building blocks of digital oscillation. These are the N-th roots of unity.

The N-th Roots of Unity

The principal N-th root of unity, $\omega_N$, is the complex number that corresponds to taking one step out of $N$ around the unit circle in the complex plane:

$$\omega_N = e^{2\pi i / N}$$

The powers of this number, $\omega_N^0, \omega_N^1, \dots, \omega_N^{N-1}$, give us $N$ perfectly spaced points on the unit circle. They are the “atoms” of discrete rotation. They are periodic: $\omega_N^{k+N} = \omega_N^k$.
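As a quick numerical sanity check (a sketch not in the original notes, using NumPy), we can verify the unit-circle and periodicity properties directly:

```python
import numpy as np

N = 8
omega = np.exp(2j * np.pi / N)          # principal N-th root of unity
powers = omega ** np.arange(N)          # N equally spaced points on the unit circle

print(np.allclose(np.abs(powers), 1.0))          # all powers lie on the unit circle
print(np.isclose(omega ** N, 1.0))               # omega^N = e^{2*pi*i} = 1
print(np.isclose(omega ** (N + 3), omega ** 3))  # periodicity: omega^{k+N} = omega^k
```

All three checks print `True`.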

Why We Use Roots of Unity

When constructing a frequency basis for discrete signals, we need functions that are both periodic and orthogonal over $N$ samples. The complex exponentials built from the roots of unity satisfy both conditions perfectly:

  1. Periodicity:
    Each root of unity returns to 1 after $N$ steps because $\omega_N^N = e^{2\pi i} = 1$. This means every vector built from powers of $\omega_N$ fits exactly within the $N$-sample window, with no leftover partial cycles.

  2. Orthogonality:
    Different frequencies $k$ and $l$ correspond to different powers of $\omega_N$. When we take their inner product, the sum

    $$\sum_{n=0}^{N-1} \omega_N^{kn}\,\overline{\omega_N^{ln}} = \sum_{n=0}^{N-1} \omega_N^{(k-l)n}$$

    equals zero unless $k = l$. This cancellation happens because the roots of unity are symmetrically distributed around the circle: their vector sum is zero. Hence, each frequency vector is orthogonal to all the others.

    The Dot Product in $\mathbb{C}^N$. In complex vector spaces, orthogonality is defined using the Hermitian inner product (also called the complex dot product):

    $$\langle u, v \rangle = \sum_{n=0}^{N-1} u_n \overline{v_n}$$

    The conjugate ensures that the length (or energy) of a vector is always a non-negative real number, even when its components are complex.
    Orthogonality means this inner product equals zero: the two vectors have no overlap in the sense of their entire sequences, not just their geometric angle on the complex plane.
    So while two roots of unity on the circle may not appear $90°$ apart visually, the sequences formed by their powers are orthogonal in $\mathbb{C}^N$ because their products cancel out over all $N$ samples.

    Example:
    Let’s take $N = 4$, so $\omega_4 = e^{2\pi i/4} = i$. The four roots of unity are $\{1, i, -1, -i\}$.

    • For $k = l = 1$:

      $$\sum_{n=0}^{3} i^n\,\overline{i^n} = \sum_{n=0}^{3} |i^n|^2 = 4$$

      The inner product is nonzero: this is the vector’s own energy.

    • For $k = 1$, $l = 0$:

      $$\sum_{n=0}^{3} i^n \cdot \overline{1} = 1 + i - 1 - i = 0$$

      Here the complex terms cancel exactly, showing perfect orthogonality between two different frequency vectors.

  3. Completeness:
    There are exactly $N$ distinct powers of $\omega_N$ before repetition, giving us $N$ unique, linearly independent vectors. Together, they span the entire $N$-dimensional space $\mathbb{C}^N$.

In short, the roots of unity give us the simplest possible discrete oscillations that (a) fit neatly into an $N$-sample window, (b) don’t interfere with each other when projected (orthogonality), and (c) completely describe all possible periodic behaviors in discrete time. This makes them the natural and unique choice for building the DFT basis.
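The cancellation argument is easy to confirm numerically. A minimal sketch (not from the original notes), using NumPy’s `np.vdot` as the Hermitian inner product (it conjugates its first argument):

```python
import numpy as np

N = 8
omega = np.exp(2j * np.pi / N)
n = np.arange(N)

def inner(k, l):
    # Hermitian inner product of the frequency-k and frequency-l sequences:
    # sum_n omega^{kn} * conj(omega^{ln}); np.vdot conjugates its first argument.
    return np.vdot(omega ** (l * n), omega ** (k * n))

print(np.isclose(inner(2, 2), N))   # same frequency: the sum is N
print(np.isclose(inner(2, 5), 0))   # different frequencies: the terms cancel to 0
```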

Assembling the Atoms into Basis Vectors

We can now use these roots of unity to construct our basis vectors. Each basis vector, $v_k$, will represent a pure wave of frequency $k$.

Definition 3.2.3: The Trigonometric Basis

The $k$-th trigonometric basis vector for $\mathbb{C}^N$ is built by taking powers of $\omega_N^k$:

$$v_k = \begin{pmatrix} 1 \\ \omega_N^k \\ \omega_N^{2k} \\ \vdots \\ \omega_N^{(N-1)k} \end{pmatrix}, \qquad (v_k)_n = \omega_N^{kn} = e^{2\pi i kn / N}$$

Let’s look at the first few vectors to build our intuition:

  • $v_0$ (Frequency 0): Here $k = 0$, so $\omega_N^0 = 1$. The vector is $(1, 1, \dots, 1)^T$. This represents a constant, non-oscillating signal, also known as the DC component or the mean of the signal.
  • $v_1$ (Frequency 1): The elements are $1, \omega_N, \omega_N^2, \dots, \omega_N^{N-1}$. As we move down the vector, we take single steps around the unit circle. This vector represents the fundamental frequency that completes one full cycle over the $N$ samples.
  • $v_2$ (Frequency 2): The elements are $1, \omega_N^2, \omega_N^4, \dots, \omega_N^{2(N-1)}$. Here, we take steps of size 2 around the unit circle. This vector completes two full cycles over the $N$ samples.
  • And so on: $v_k$ represents the $k$-th harmonic, a pure wave that completes $k$ cycles.

These vectors, $v_0, v_1, \dots, v_{N-1}$, form a new basis for our vector space $\mathbb{C}^N$.
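A short sketch of these basis vectors in NumPy (variable names are illustrative, not from the notes):

```python
import numpy as np

N = 8
omega = np.exp(2j * np.pi / N)
n = np.arange(N)

# (v_k)_n = omega^{kn}: the k-th trigonometric basis vector
v = [omega ** (k * n) for k in range(N)]

# k = 0 gives the constant DC vector (1, 1, ..., 1)
print(np.allclose(v[0], np.ones(N)))

# Stacking the v_k as columns gives an invertible matrix,
# so the N vectors are linearly independent and span C^N
V = np.column_stack(v)
print(np.abs(np.linalg.det(V)) > 1e-9)
```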

The DFT as a Basis Change

Any time-domain signal $x \in \mathbb{C}^N$ can be written as a unique linear combination of these new basis vectors:

$$x = c_0 v_0 + c_1 v_1 + \dots + c_{N-1} v_{N-1} = \sum_{k=0}^{N-1} c_k v_k$$

The coefficients $c_k$ are the Fourier coefficients. They are the coordinates of our signal in the frequency basis. The vector $c = (c_0, c_1, \dots, c_{N-1})^T$ is the frequency-domain representation of our signal.

This is the heart of the DFT. It’s a machine for finding the coordinates $c$ given the signal $x$. We can write this relationship using the Fourier Matrix, $F_N$, whose columns are our new basis vectors:

$$F_N = \begin{pmatrix} | & | & & | \\ v_0 & v_1 & \cdots & v_{N-1} \\ | & | & & | \end{pmatrix}, \qquad (F_N)_{nk} = \omega_N^{kn}$$

The equation for the basis change is then simply:

$$x = F_N\, c$$

  • Inverse DFT: This equation shows how to reconstruct the time-domain signal $x$ from its frequency coordinates $c$. This is what ifft does.
  • Forward DFT: To find the frequency coordinates, we need to solve for $c$: $c = F_N^{-1} x$. This is what fft does.
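We can sketch this basis change explicitly and compare it against NumPy (an illustrative check, not from the notes). One caveat: NumPy’s `fft` omits the $1/N$ normalization used in our convention, so the results agree up to that factor:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
omega = np.exp(2j * np.pi / N)
n = np.arange(N)

# Fourier matrix: column k is the basis vector v_k, i.e. F[n, k] = omega^{kn}
F = omega ** np.outer(n, n)

x = rng.standard_normal(N)      # a time-domain signal
c = np.linalg.solve(F, x)       # frequency coordinates: solve x = F c

print(np.allclose(F @ c, x))                 # reconstruction: x = F c
print(np.allclose(c, np.fft.fft(x) / N))     # matches NumPy's fft up to 1/N
```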

The Magic of Orthogonality

Normally, calculating a matrix inverse is a computationally expensive nightmare ($O(N^3)$). This would make the DFT useless for large signals. But our trigonometric basis is not just any basis; it is an orthogonal basis.

The “dot product” (Hermitian inner product) of any two different basis vectors is exactly zero:

$$\langle v_k, v_l \rangle = \sum_{n=0}^{N-1} \omega_N^{kn}\,\overline{\omega_N^{ln}} = 0 \quad \text{for } k \neq l$$

When $k = l$, the sum is $N$. This simple, beautiful structure means that the inverse of the Fourier matrix is incredibly easy to find:

$$F_N^{-1} = \frac{1}{N} F_N^{*}$$

This is a remarkable result! The inverse transform is just the conjugate transpose of the forward transform (with a scaling factor of $1/N$). There is no complex system to solve. This property is what allows the Fast Fourier Transform (FFT) algorithm to compute the DFT in a blistering $O(N \log N)$ time.
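This identity takes only a couple of lines to verify numerically (a sketch, assuming the Fourier matrix entries $\omega_N^{kn}$ as defined above):

```python
import numpy as np

N = 8
omega = np.exp(2j * np.pi / N)
n = np.arange(N)
F = omega ** np.outer(n, n)     # Fourier matrix, F[n, k] = omega^{kn}

# Orthogonal columns of squared length N make the inverse trivial:
print(np.allclose(np.linalg.inv(F), F.conj().T / N))   # F^{-1} = (1/N) F*
print(np.allclose(F @ (F.conj().T / N), np.eye(N)))    # F (1/N) F* = I
```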

The plots on the slide perfectly illustrate the power of this basis change.

  • Left (Time Domain): The signal is a mix of waves. Its representation in the standard basis is dense and doesn’t reveal the underlying structure.
  • Right (Frequency Domain): This is the vector of Fourier coefficients, . The representation is sparse. Only a few coefficients are large, telling us immediately that the signal is composed of just a few pure frequencies. This representation is more meaningful and highly compressible.

fftshift: Making Frequency Plots Intuitive

A standard FFT algorithm computes the coefficients $c_k$ for frequency indices $k = 0, 1, \dots, N-1$.
However, this ordering does not match how we intuitively think about frequency. To understand why, we need to talk about positive, negative, and zero frequencies in the complex domain.

Positive, Negative, and Zero Frequencies

Each DFT basis vector corresponds to a discrete complex exponential:

$$(v_k)_n = e^{2\pi i kn / N}$$

  • When $0 < k < N/2$, the exponential rotates counter-clockwise around the complex plane as $n$ increases; this is called a positive frequency.
  • When $N/2 < k < N$, it rotates clockwise: a negative frequency.
  • When $k = 0$, it doesn’t rotate at all; this is the zero frequency or DC component, representing the constant average value of the signal.

Because the DFT assumes periodicity with period $N$, a frequency of $k$ is mathematically equivalent to $k - N$:

$$e^{2\pi i kn / N} = e^{2\pi i (k - N)n / N} \quad \text{for all integers } n$$

So the higher half of the FFT output actually represents negative frequencies.
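This aliasing is easy to demonstrate (a minimal check, not from the original notes):

```python
import numpy as np

# Frequency k and frequency k - N produce identical sampled exponentials.
N, k = 8, 6                      # k = 6 aliases to k - N = -2
n = np.arange(N)
same = np.allclose(np.exp(2j * np.pi * k * n / N),
                   np.exp(2j * np.pi * (k - N) * n / N))
print(same)   # True: the two sequences are indistinguishable at the samples
```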

How the FFT Orders Frequencies

The raw FFT output is arranged as:

$$c_0,\; c_1,\; \dots,\; c_{N-1}$$

which (for even $N$) corresponds to frequency bins:

  • $k = 0$: DC component (frequency = 0)
  • $k = 1, \dots, N/2 - 1$: Positive frequencies
  • $k = N/2$: Nyquist frequency, the highest resolvable frequency (half the sampling rate)
  • $k = N/2 + 1, \dots, N - 1$: Negative frequencies (rotating in the opposite direction), equivalent to $k - N$

This ordering is natural for computation, but awkward for plotting, since zero frequency is at the left edge rather than in the center.

The Role of fftshift

fftshift reorders the FFT output so that:

  • The zero frequency (DC) component appears in the middle,
  • The negative frequencies appear on the left, and
  • The positive frequencies appear on the right.

Visually, it “rotates” the FFT output by $N/2$ samples, producing a spectrum centered at 0:

np.fft.fftshift(c)

This rearrangement makes it much easier to interpret spectra and compare real vs. complex components.

Example: Why fftshift Matters

Consider a signal that contains both positive and negative frequency components. Before applying fftshift, the FFT output shows two peaks, one near a low index (positive frequency) and another near the end of the array (negative frequency). After fftshift, these peaks are symmetrically arranged around zero, making the spectrum intuitive to read.
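The paragraph above can be sketched concretely. Assuming a single real cosine at 10 cycles per window (the threshold $N/4$ is an arbitrary choice to isolate the peaks), its two FFT peaks move from bins $10$ and $N - 10$ to $\pm 10$:

```python
import numpy as np

N = 64
n = np.arange(N)
x = np.cos(2 * np.pi * 10 * n / N)   # real cosine: 10 cycles per window

X = np.fft.fft(x)
# Raw ordering: peaks at bins k = 10 and k = N - 10 = 54
print(np.flatnonzero(np.abs(X) > N / 4))

# After fftshift, the same peaks sit symmetrically at -10 and +10
freqs = np.fft.fftshift(np.fft.fftfreq(N, d=1 / N))   # ..., -2, -1, 0, 1, 2, ...
X_shifted = np.fft.fftshift(X)
print(freqs[np.abs(X_shifted) > N / 4])
```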

Application: A Recipe for Denoising a Signal

Let’s use our new understanding to perform a classic signal processing task: removing noise.

Imagine we have a signal that is supposed to be clean but has been corrupted by random noise.

The Problem: In the time domain (top right plot), the original signal (a sum of two sine waves) is completely hidden by the noise. We can’t separate them by just looking at the samples.

The Frequency Domain Solution (A Recipe):

  1. Transform to the Frequency Domain: We take our noisy signal vector $y$ and compute its Fourier coefficients $c = F_N^{-1} y$ using the FFT.

  2. Analyze the Spectrum: We look at the power spectrum, which is simply the magnitude squared of the coefficients, $|c_k|^2$. This tells us how much energy is at each frequency. The bottom left plot shows the result. The structure is now crystal clear:

    • Two tall, sharp peaks represent the strong, coherent energy of our original sine waves.
    • A noisy, low-level “floor” represents the random noise, which has its energy spread thinly across all frequencies.
  3. Filter in the Frequency Domain: This is the key step. We apply a threshold. We assume any coefficient with a small magnitude is noise, so we set it to zero. We keep only the few large coefficients that form the peaks. This gives us a new, clean vector of coefficients, $\tilde{c}$.

  4. Transform Back to the Time Domain: We use the inverse FFT to transform our clean frequency representation back into a time-domain signal: $\tilde{y} = F_N \tilde{c}$, where $\tilde{c}$ is the thresholded coefficient vector.

The result is a clean signal with the noise almost entirely eliminated. This powerful technique of transform, filter, and inverse transform is fundamental to audio engineering, image processing, and countless other fields, and it is only possible because the Fourier basis allows us to clearly separate signal from noise.

import numpy as np
import matplotlib.pyplot as plt
 
np.random.seed(42)
t = np.arange(0, 256)          # 256 points
 
# Original signal: sum of two sine waves (4 and 16 cycles over the 256 samples)
x = np.sin(2 * np.pi * t / 64) + np.sin(4 * 2 * np.pi * t / 64)
 
# Noisy signal: add white Gaussian noise
y = x + np.random.randn(len(t))
 
# DFT and thresholding: zero out coefficients below the noise floor
Y = np.fft.fft(y)
mag = np.abs(Y)
threshold = 40
Y_denoised = Y * (mag > threshold)
 
# Inverse DFT to recover signal
y_denoised = np.fft.ifft(Y_denoised)
 
# Plotting
plt.figure(figsize=(12,7))
plt.subplot(221)
plt.plot(t, x)
plt.title("Original Signal")
 
plt.subplot(222)
plt.plot(t, y)
plt.title("Noisy Signal")
 
plt.subplot(223)
plt.plot(np.abs(Y))
plt.title("DFT Magnitude Spectrum")
 
plt.subplot(224)
plt.plot(t, y_denoised.real)
plt.title("Denoised Signal (IFFT)")
 
plt.tight_layout()
plt.show()

Continue here: 06 DFT Applications - Convolution, Algorithms, and Applications