TRI NGUYEN

Introduction

Mathematics is the foundation of all the 'magic' we use today. While some resources may specify the underlying mathematical concepts, it isn't always clear what they mean or how they apply in our daily lives. This blog will cover how Fourier Transformation eases the approach to processing images and signals, and how to extend its application beyond these areas.

Fourier Transformation

When considering signals, we are considering them in terms of waves. From this, we can apply Fourier Transformation to split the waves into a set of sine/cosine waves. We can visualize this by the following:

So why is it important?

Given that a wave can be decomposed into an infinite series of sine waves (Fourier series), we can selectively store the most important components of a signal to create an accurate replica of the original without damaging the information transmitted. For example, we can disregard smaller frequencies, such as the fourth harmonic in the above example, and keep only the fundamental, second, and third harmonic waves. When combined, these waves will reconstruct a wave that closely resembles the original one. While it may have slightly fewer features and sharpness, it will still preserve the important information, with the differences being negligible or indistinguishable to the human eye.

Discrete Fourier Transform (DFT)

To ensure the process returns an accurate output, we can apply DFT which can be think of as an algorithm determine what frequencies, when added together, will recreate the source wave. Let $f(t)$ be a continuous signal (the source signal), we can consider $N$ samples of split waves $f[0], f[1], f[2],..., f[k],..., f[N - 1]$ . From this, the Fourier transform of the original signal, $f(t)$ , is:

F(j \omega) = \int_{-\infty}^{\infty} f(t)e^{-j\omega t} \, dt

However, for a finite number of frequencies (we cannot consider the infinite amount of singals in real life), the transform will be call DFT and represent as:

F(j\omega) = \sum_{k = 0}^{N - 1}f[k]e^{-j\omega kT}

The DFT treats the input data as it is periodic ( $f(N)$ to $f(2N - 1)$ will be the same as $f(0)$ to $f(N - 1)$ ). Therefore, we can get a fairly accurate result that evaluate the fundamental and its harmonics frequencies by considering the following matrix form:

\begin{bmatrix} F[0] \\ F[1] \\ F[2] \\ \vdots \\ F[N-1] \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 & \cdots & 1 \\ 1 & W & W^2 & W^3 & \cdots & W^{N-1} \\ 1 & W^2 & W^4 & W^6 & \cdots & W^{N-2} \\ 1 & W^3 & W^6 & W^9 & \cdots & W^{N-3} \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & W^{N-1} & W^{N-2} & W^{N-3} & \cdots & W \end{bmatrix} \begin{bmatrix} f[0] \\ f[1] \\ f[2] \\ \vdots \\ f[N-1] \end{bmatrix}

where $W$ is $e^{-j\omega}$

From this, to reconstruct a "copy" of the orignal wave, we can take the inverse of this matrix, which is $\dfrac{1}{N}$ times the complex conjuate of the orignal matrix, to get the answer.

Correlation

We can see that the formula of DFT has the following format:

\sum_{i = 0}^N a(i)b(i)

which is a product of two functions. The result of this product can indicate the correlation (similarity) between the two functions by looking at the result's sign and value:

If $a(i)$ is similar to $b(i)$ , then when $a(i)$ is posivtive can lead to $b(i)$ be possitive and vice versa, a negative $a(i)$ can lead to a negative $b(i)$ .
If $a(i)$ is not that similar to $b(i)$ , the result can be low.

Corrolated signals

Uncorrelated signals

So what does DFT really mean here?

Applying the concept of correlation above to DFT, we can see it is the inner product of a function $f[k]$ with a complex exponential (representing sine/cosine waves components). Therefore, this inner product can be think as a measure of the correlation between two funcions that represent each individual wave and the source wave where the complex exponentials act as the basis function of the source space. By projecting each input sample $f(k)$ onto these basis functions (via multiplication and summation), the DFT computes a similarity score that tells us how much of that basis frequency is present in the source wave.

Compressing images

By establishing a metric for the similarities between individual signals and the source signal, we can compress the signal while preserving the core information. This is achieved by setting all values close to zero to zero. Then, we use the inverse DFT to reconstruct a compressed signal that will be indistinguishable to human perception. However, to actually compress images, which are real-life objects, we will use Discrete Cosine Transform (DCT). This is similar to DFT but we only working with real numbers instead of complex numbers.

JPEGs

To compress images, one approach is to represent them using sine and cosine waves. However, these representations will appear as lines, not curves. Starting with grayscale images, we can use horizontal and vertical wave images and multiply them together to form 8x8 blocks, a good size balance between compression quality and performance. By adding these blocks, we can reconstruct a compressed image that closely resembles the original, but occupies less storage space. You can test how this compression works here:

Image Compression Tool

Choose an image file

PNG, JPG, GIF up to 10MB

Wrapping Up

This Discrete Fourier Transform (DFT) approximates the Fourier Transform of the function's data, so it is not completely accurate. Therefore, one must carefully consider the input data to avoid two common errors: leakage and aliasing, as well as the time required to run the algorithm. In all cases, DFT calculation is brute-force; for each output, we sum N terms and repeat this for N outputs, leading to an $O(N^2)$ time complexity. To effectively apply this logic, we must also evaluate cases individually. For instance, in the image compression example mentioned earlier, we use the Discrete Cosine Transform (DCT), or we can use a more dynamic programming approach like the Fast Fourier Transform (FFT), which uses memoization to reuse calculated values.