Spectrogram analysis in Python
A spectrogram is a visual representation of the spectrum of frequencies of a signal as it varies with time. When applied to an audio signal, spectrograms are sometimes called sonographs, voiceprints, or voicegrams. More information is available on Wikipedia.
How do we decode this?
How do we interpret a spectrogram? Imagine it as a chart where typically the horizontal axis denotes time, the vertical axis denotes frequency, and the color or brightness of the points on the chart reflects the amplitude (or volume) of a specific frequency at a given time.
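To make the axes concrete, here is a small sketch that converts a cell position in the spectrogram matrix (row, column) into a frequency in Hz and a time in seconds. The values `sr=22050`, `n_fft=2048` and `hop_length=512` are assumptions (they are the defaults of `librosa.load()` and `librosa.stft()`), and `cell_to_axes` is a hypothetical helper, not a librosa function.

```python
def cell_to_axes(row, col, sr=22050, n_fft=2048, hop_length=512):
    """Map a spectrogram cell (row, col) to (frequency in Hz, time in s)."""
    freq_hz = row * sr / n_fft       # rows are FFT bins spaced sr / n_fft apart
    time_s = col * hop_length / sr   # columns are frames spaced hop_length samples apart
    return freq_hz, time_s


n_rows = 2048 // 2 + 1               # a real-valued signal yields n_fft // 2 + 1 rows
top_freq_hz, first_time_s = cell_to_axes(n_rows - 1, 0)
freq_hz, time_s = cell_to_axes(row=49, col=10)

print(n_rows, top_freq_hz)           # the top row sits at the Nyquist frequency sr / 2
print(round(freq_hz, 1), round(time_s, 3))
```

With these settings, neighbouring rows are about 10.8 Hz apart and neighbouring columns about 23 ms apart, which is the resolution you see in the plots.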
Spectrograms are generated through several methods, such as passing the audio signal through a series of band-pass filters, or by applying a mathematical operation known as the Fourier transform. These techniques decompose the incoming audio signal into its constituent frequency components, effectively revealing the fundamental 'blueprint' of the sound.
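The Fourier-transform route can be sketched in a few lines of NumPy: cut the signal into overlapping frames, apply a window to each frame, and take the FFT of every frame. This is only an illustration under stated assumptions: `simple_stft`, the 8000 Hz sample rate, the frame size and the 440 Hz test tone are all made up for the example, and unlike `librosa.stft()` the sketch does not pad the signal.

```python
import numpy as np


def simple_stft(signal, n_fft=1024, hop_length=256):
    """Magnitude spectrogram: slice into frames, window each one, take its FFT."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    frames = np.stack([
        signal[i * hop_length : i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal
    return np.abs(np.fft.rfft(frames, axis=1)).T   # shape: (n_fft // 2 + 1, n_frames)


sr = 8000                                  # assumed sample rate of the toy signal
t = np.arange(sr) / sr                     # one second of sample times
tone = np.sin(2 * np.pi * 440.0 * t)       # pure 440 Hz sine wave

spec = simple_stft(tone)
freqs = np.fft.rfftfreq(1024, d=1 / sr)    # frequency in Hz of each row
peak_hz = freqs[spec.mean(axis=1).argmax()]
print(spec.shape, peak_hz)                 # the strongest row lies close to 440 Hz
```

Each column of `spec` is the spectrum of one short slice of the sound, so plotting the matrix as an image gives the time-frequency chart described above.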
Let's try to code this in Python 😎
import os
import matplotlib.pyplot as plt
import librosa, librosa.display
import IPython.display as ipd
import numpy as np
Download your data
For this article I downloaded two sound samples from the https://freesound.org/ website: the note C5 played on a violin and on a piano. You can download any sounds you like for this exercise.
BASE_FOLDER = "/home/benjamin/workspace/cours/linainsaf.github.io/docs/en/audio/files"
violin_sound_file = "violin_c5.wav"
piano_sound_file = "piano_c5.wav"
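# Load your sounds. librosa.load() returns (samples, sample_rate) and, by default,
# converts to mono and resamples to 22050 Hz; the sample rate is stored in "_".
violin_c5, _ = librosa.load(os.path.join(BASE_FOLDER, violin_sound_file))
piano_c5, _ = librosa.load(os.path.join(BASE_FOLDER, piano_sound_file))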
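def plot_spectrogram(signal, name):
    """
    Compute a dB-scaled spectrogram with the Short-Time Fourier Transform
    (`librosa.stft()`) and plot the result.
    """
    # Take the magnitude first: amplitude_to_db() warns on complex input
    spectrogram = librosa.amplitude_to_db(np.abs(librosa.stft(signal)))
    # Adjust the figure size to your needs
    plt.figure(figsize=(20, 15))
    # x_axis="time" labels frames in seconds (assumes librosa's default sr and hop length)
    librosa.display.specshow(spectrogram, x_axis="time", y_axis="log")
    plt.colorbar(format="%+2.0f dB")
    plt.title(f"Log-frequency power spectrogram for {name}")
    plt.xlabel("Time")
    plt.show()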
# Play your sounds interactively with IPython.display
ipd.Audio(os.path.join(BASE_FOLDER, violin_sound_file))
plot_spectrogram(violin_c5, "c5 on violin")
ipd.Audio(os.path.join(BASE_FOLDER, piano_sound_file))
plot_spectrogram(piano_c5, "c5 on piano")
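The colour scale of these plots is in decibels, and the conversion only makes sense on magnitudes, not on the complex values the STFT returns. As a rough sketch of that conversion (the parameter names `ref`, `amin` and `top_db` are borrowed from `librosa.amplitude_to_db`, but `to_db` itself is a hypothetical helper and may differ from librosa in edge cases):

```python
import numpy as np


def to_db(stft_values, ref=1.0, amin=1e-5, top_db=80.0):
    """Convert (possibly complex) STFT values to decibels relative to `ref`."""
    magnitude = np.abs(stft_values)                          # drop the phase explicitly
    db = 20.0 * np.log10(np.maximum(amin, magnitude) / ref)  # amin avoids log10(0)
    return np.maximum(db, db.max() - top_db)                 # limit the dynamic range


values = np.array([1.0 + 0.0j, 0.0 + 0.1j, 0.0 + 0.0j])
db_values = to_db(values)
print(db_values)   # a tenfold drop in amplitude is a 20 dB drop
```

Silence would map to minus infinity, which is why the sketch floors very small magnitudes and clips everything more than `top_db` below the loudest value.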
Discrete Fourier Transform
In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency. More on Wikipedia.
The DFT is the counterpart of the continuous Fourier transform for signals known only at a finite number of instants separated by the sample time (i.e. a finite sequence of data).
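To see what the definition computes, here is a direct (and slow) evaluation of the DFT sum in plain Python. It is a teaching sketch, not how `np.fft.fft` is implemented: `naive_dft` is a hypothetical helper and the 16-sample cosine is a made-up input.

```python
import cmath
import math


def naive_dft(samples):
    """Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2j * pi * k * n / N)."""
    n_samples = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n, x in enumerate(samples))
        for k in range(n_samples)
    ]


# Toy input: a cosine that completes exactly 3 cycles over 16 samples
N = 16
signal = [math.cos(2 * math.pi * 3 * n / N) for n in range(N)]
magnitudes = [abs(value) for value in naive_dft(signal)]
print([round(m, 6) for m in magnitudes])   # energy only in bins 3 and N - 3 = 13
```

The output has the same length as the input, and a real-valued signal produces a mirrored pair of peaks: one at its frequency bin and one at `N` minus that bin.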
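sr = 22050  # sample rate returned by librosa.load() (its default)
X = np.fft.fft(violin_c5)
X_mag = np.abs(X)  # magnitude spectrum
# Bin k of an N-point FFT sits at k * sr / N Hz, so the end point sr is excluded
f = np.linspace(0, sr, len(X_mag), endpoint=False)
plt.figure(figsize=(18, 10))
plt.plot(f, X_mag)
plt.xlabel('Frequency (Hz)')
plt.ylabel('Magnitude')
plt.show()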
len(violin_c5)
12964
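The length printed above also fixes the frequency resolution of this FFT. The sketch below works it out and locates the strongest frequency. It assumes a sample rate of 22050 Hz (the `librosa.load()` default) and uses a synthetic pure tone at C5 (about 523.25 Hz) as a stand-in for the recording, so the numbers illustrate the method rather than the actual violin file.

```python
import numpy as np

sr = 22050            # assumed: librosa.load() default sample rate
n_samples = 12964     # length of violin_c5 reported above

duration_s = n_samples / sr        # about 0.59 s of audio
bin_spacing_hz = sr / n_samples    # about 1.70 Hz between neighbouring FFT bins

# Synthetic stand-in for the recording: a pure tone at C5
t = np.arange(n_samples) / sr
tone = np.sin(2 * np.pi * 523.25 * t)

magnitude = np.abs(np.fft.rfft(tone))           # non-negative frequencies only
freqs = np.fft.rfftfreq(n_samples, d=1 / sr)    # matching frequency axis in Hz
peak_hz = freqs[magnitude.argmax()]
print(round(duration_s, 3), round(bin_spacing_hz, 2), peak_hz)
```

Using `np.fft.rfft` drops the mirrored upper half of the spectrum, so the plot only needs to cover 0 Hz up to `sr / 2`. On a real instrument recording you would expect further peaks near integer multiples of the fundamental.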