Spectrogram¶

Mel Spectrogram¶

melspectrogram.py: Utilities for dealing with mel spectrograms

class opensoundscape.melspectrogram.MelSpectrogram(S, sample_rate, hop_length, fmin, fmax)¶

Immutable spectrogram container

classmethod from_audio(audio, n_fft=1024, n_mels=128, window='flattop', win_length=256, hop_length=32, htk=True, fmin=None, fmax=None)¶

Create a MelSpectrogram object from an Audio object

The kwargs are cherry-picked from:

Parameters:

n_fft – Length of the FFT window [default: 1024]
n_mels – Number of mel bands to generate [default: 128]
window – The windowing function to use [default: “flattop”]
win_length – Each frame of audio is windowed by window. The window will be of length win_length and then padded with zeros to match n_fft [default: 256]
hop_length – Number of samples between successive frames [default: 32]
htk – use HTK formula instead of Slaney [default: True]
fmin – lowest frequency (in Hz) [default: None]
fmax – highest frequency (in Hz). If None, use fmax = sr / 2.0 [default: None]

Returns:

opensoundscape.melspectrogram.MelSpectrogram object
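
For intuition about how `htk`, `fmin`, `fmax`, and `n_mels` interact, the following sketch spaces mel band edges evenly on the HTK mel scale. The helper names (`hz_to_mel_htk`, `mel_to_hz_htk`, `mel_band_edges`) are hypothetical and not part of opensoundscape. Treating `fmin=None` as 0 Hz and using `n_mels + 2` edge points (the usual triangular-filterbank layout) are assumptions, not confirmed behavior of `from_audio`.

```python
import math


def hz_to_mel_htk(freq_hz):
    """Convert a frequency in Hz to mels using the HTK formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz_htk(mel):
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_mels, sample_rate, fmin=None, fmax=None):
    """Return n_mels + 2 band-edge frequencies (Hz), evenly spaced in mels."""
    fmin = 0.0 if fmin is None else fmin  # assumption: None means 0 Hz
    fmax = sample_rate / 2.0 if fmax is None else fmax  # documented: sr / 2.0
    low_mel, high_mel = hz_to_mel_htk(fmin), hz_to_mel_htk(fmax)
    step = (high_mel - low_mel) / (n_mels + 1)
    return [mel_to_hz_htk(low_mel + i * step) for i in range(n_mels + 2)]


edges = mel_band_edges(n_mels=128, sample_rate=22050)
print(len(edges), round(edges[0], 1), round(edges[-1], 1))
```

Because the edges are evenly spaced in mels rather than in Hz, the bands are narrow at low frequencies and progressively wider toward `fmax`.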

to_image(shape=None, mode='RGB', s_range=(0, 20))¶

Generate PIL Image from MelSpectrogram

Given a range of values for S (by default, minimum 0 and maximum 20), generate a PIL image in 3-channel (RGB) or single-channel (L) mode. The image can optionally be resized.

Parameters:

shape – Resize to shape (h, w) [default: None]
mode – Mode to write out “RGB” or “L” [default: “RGB”]
s_range – The input range of S [default: (0, 20)]

Returns:

PIL.Image

to_pcen(gain=0.8, bias=10.0, power=0.25, time_constant=0.06)¶

Create PCEN from MelSpectrogram

Argument descriptions come from https://librosa.org/doc/latest/generated/librosa.pcen.html?highlight=pcen#librosa-pcen

Parameters:

gain – The gain factor. Typical values should be slightly less than 1 [default: 0.8]
bias – The bias point of the nonlinear compression [default: 10.0]
power – The compression exponent. Typical values should be between 0 and 0.5. Smaller values of power result in stronger compression. At the limit power=0, polynomial compression becomes logarithmic [default: 0.25]
time_constant – The time constant for IIR filtering, measured in seconds [default: 0.06]

Returns:

The per-channel energy normalized version of MelSpectrogram.S
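
To show how the four parameters enter the computation, here is a simplified NumPy sketch of per-channel energy normalization based on the formula in the librosa documentation linked above. The function `pcen_sketch` is hypothetical, not the opensoundscape implementation. The `eps` value and the choice to initialise the smoother with the first frame are assumptions made to keep the example short.

```python
import numpy as np


def pcen_sketch(S, sample_rate, hop_length, gain=0.8, bias=10.0,
                power=0.25, time_constant=0.06, eps=1e-6):
    """Simplified PCEN for a non-negative array S of shape (bands, frames)."""
    S = np.asarray(S, dtype=float)

    # Convert the time constant from seconds to frames, then derive the
    # coefficient of the first-order IIR low-pass filter.
    t_frames = time_constant * sample_rate / hop_length
    b = (np.sqrt(1.0 + 4.0 * t_frames ** 2) - 1.0) / (2.0 * t_frames ** 2)

    # M is a smoothed copy of S along the time axis (axis 1).
    M = np.empty_like(S)
    M[:, 0] = S[:, 0]  # assumption: start the filter at the first frame
    for t in range(1, S.shape[1]):
        M[:, t] = (1.0 - b) * M[:, t - 1] + b * S[:, t]

    # Adaptive gain control followed by root compression.
    return (S / (eps + M) ** gain + bias) ** power - bias ** power


S_const = np.full((3, 50), 4.0)
out = pcen_sketch(S_const, sample_rate=22050, hop_length=32)
print(out.shape)
```

A larger `time_constant` makes `M` follow the signal more slowly, so brief events stand out more against the smoothed background.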

Spectrogram¶

spectrogram.py: Utilities for dealing with spectrograms

class opensoundscape.spectrogram.Spectrogram(spectrogram, frequencies, times)¶

Immutable spectrogram container

amplitude(freq_range=None)¶

create an amplitude vs. time signal from the spectrogram by summing pixels in the vertical (frequency) dimension

Parameters:	freq_range – sum the Spectrogram only in this range of [low, high] frequencies in Hz; if None, all frequencies are summed [default: None]
Returns:	a time-series array of the vertical sum of spectrogram values
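
The vertical sum can be sketched with plain NumPy arrays. `amplitude_sketch` is a hypothetical stand-in for the method. It assumes the spectrogram array has one row per frequency and one column per time step, and that the band limits are inclusive.

```python
import numpy as np


def amplitude_sketch(spectrogram, frequencies, freq_range=None):
    """Sum spectrogram rows whose frequency falls inside freq_range."""
    spectrogram = np.asarray(spectrogram, dtype=float)
    frequencies = np.asarray(frequencies, dtype=float)
    if freq_range is None:
        rows = np.ones(len(frequencies), dtype=bool)  # keep every row
    else:
        low, high = freq_range
        rows = (frequencies >= low) & (frequencies <= high)  # assumption: inclusive
    return spectrogram[rows, :].sum(axis=0)


spec = np.array([[1.0, 2.0, 3.0],
                 [10.0, 20.0, 30.0],
                 [100.0, 200.0, 300.0]])
freqs = np.array([0.0, 1000.0, 2000.0])

full = amplitude_sketch(spec, freqs)
band = amplitude_sketch(spec, freqs, freq_range=[500, 2500])
print(full, band)
```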

bandpass(min_f, max_f)¶

extract a frequency band from a spectrogram

crops the 2-d array of the spectrograms to the desired frequency range

Parameters:

min_f – low frequency in Hz for bandpass
max_f – high frequency in Hz for bandpass

Returns:

bandpassed spectrogram object
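
Cropping to a frequency band amounts to keeping only the rows whose frequency lies between `min_f` and `max_f`. The sketch below uses a hypothetical `bandpass_sketch` function on raw arrays. The row-per-frequency layout and the inclusive limits are assumptions.

```python
import numpy as np


def bandpass_sketch(spectrogram, frequencies, min_f, max_f):
    """Return the rows (and their frequencies) inside [min_f, max_f]."""
    spectrogram = np.asarray(spectrogram, dtype=float)
    frequencies = np.asarray(frequencies, dtype=float)
    keep = (frequencies >= min_f) & (frequencies <= max_f)
    return spectrogram[keep, :], frequencies[keep]


spec = np.arange(12, dtype=float).reshape(4, 3)  # 4 frequency rows, 3 time columns
freqs = np.array([0.0, 1000.0, 2000.0, 3000.0])

cropped, cropped_freqs = bandpass_sketch(spec, freqs, min_f=1000, max_f=2000)
print(cropped.shape, cropped_freqs)
```

Note that this is a crop, not a filter applied to the audio: values outside the band are discarded rather than attenuated.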

classmethod from_audio(audio, window_type='hann', window_samples=512, overlap_samples=256, decibel_limits=(-100, -20))¶

create a Spectrogram object from an Audio object

Parameters:

window_type="hann" – see scipy.signal.spectrogram docs for description of window parameter
window_samples=512 – number of audio samples per spectrogram window (pixel)
overlap_samples=256 – number of samples shared by consecutive windows
decibel_limits=(-100, -20) – limit the dB values to (min,max) (lower values set to min, higher values set to max)

Returns:

opensoundscape.spectrogram.Spectrogram object
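
The window parameters determine the size and resolution of the resulting spectrogram. The arithmetic below is a rough guide only. `spectrogram_shape` is a hypothetical helper, and it assumes a one-sided FFT whose length equals `window_samples` with no padding at the signal edges, as in the default behavior of `scipy.signal.spectrogram`.

```python
def spectrogram_shape(n_samples, sample_rate, window_samples=512,
                      overlap_samples=256):
    """Estimate spectrogram dimensions and resolution from window settings."""
    hop = window_samples - overlap_samples  # samples between window starts
    return {
        # Assumption: no padding, so only complete windows are counted.
        "n_windows": (n_samples - overlap_samples) // hop,
        # Assumption: one-sided FFT of length window_samples.
        "n_frequency_bins": window_samples // 2 + 1,
        "seconds_per_step": hop / sample_rate,
        "hz_per_bin": sample_rate / window_samples,
    }


# One second of audio sampled at 22050 Hz, with the default window settings.
shape = spectrogram_shape(n_samples=22050, sample_rate=22050)
print(shape)
```

Increasing `window_samples` gives finer frequency resolution at the cost of coarser time resolution; increasing `overlap_samples` adds more columns without changing either resolution.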

classmethod from_file()¶

create a Spectrogram object from a file

Parameters:	file – path of image to load
Returns:	opensoundscape.spectrogram.Spectrogram object

limit_db_range(min_db=-100, max_db=-20)¶

Limit the decibel values of the spectrogram to range from min_db to max_db

values less than min_db are set to min_db; values greater than max_db are set to max_db

similar to Audacity’s gain and range parameters

Parameters:

min_db – values lower than this are set to this
max_db – values higher than this are set to this

Returns:

Spectrogram object with db range applied
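
On the underlying array, this kind of limiting is equivalent to clipping. A minimal sketch with NumPy, using a hypothetical `limit_db_range_sketch` function rather than the method itself:

```python
import numpy as np


def limit_db_range_sketch(spectrogram_db, min_db=-100, max_db=-20):
    """Clip decibel values into the closed interval [min_db, max_db]."""
    return np.clip(np.asarray(spectrogram_db, dtype=float), min_db, max_db)


spec_db = np.array([[-120.0, -60.0, -10.0]])
limited = limit_db_range_sketch(spec_db)
print(limited)
```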

linear_scale(feature_range=(0, 1))¶

Linearly rescale spectrogram values to feature_range, using the spectrogram's decibel_limits as the input range

Parameters:	feature_range – tuple of (low,high) values for output
Returns:	Spectrogram object with values rescaled to feature_range

min_max_scale(feature_range=(0, 1))¶

Linearly rescale spectrogram values to feature_range, using the spectrogram's minimum and maximum values as the input range

Parameters:	feature_range – tuple of (low,high) values for output
Returns:	Spectrogram object with values rescaled to feature_range
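
The two rescaling methods differ only in the input range they map from. The sketch below makes that difference concrete with a hypothetical `rescale` helper; the example `decibel_limits` value of (-100, -20) is taken from the `from_audio` default and is assumed here for illustration.

```python
import numpy as np


def rescale(values, in_range, feature_range=(0, 1)):
    """Linearly map values from in_range onto feature_range."""
    in_low, in_high = in_range
    out_low, out_high = feature_range
    fraction = (np.asarray(values, dtype=float) - in_low) / (in_high - in_low)
    return out_low + fraction * (out_high - out_low)


spec = np.array([[-80.0, -60.0],
                 [-40.0, -30.0]])
decibel_limits = (-100, -20)  # assumed, matching the from_audio default

# linear_scale-style: the input range is fixed by decibel_limits.
linear = rescale(spec, in_range=decibel_limits)

# min_max_scale-style: the input range is taken from the data itself.
min_max = rescale(spec, in_range=(spec.min(), spec.max()))
print(linear)
print(min_max)
```

With a fixed input range, the same dB value always maps to the same output, which keeps spectrograms comparable with one another. Min-max scaling always fills the whole output range, so the mapping changes from one spectrogram to the next.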

net_amplitude(signal_band, reject_bands=None)¶

create amplitude signal in signal_band and subtract amplitude from reject_bands

rescale the signal and reject bands by dividing by their bandwidths in Hz (the amplitude of each reject_band is divided by the total bandwidth of all reject_bands; the amplitude of signal_band is divided by the bandwidth of signal_band)

Parameters:

signal_band – [low,high] frequency range in Hz (positive contribution)
reject_bands – list of [low,high] frequency ranges in Hz (negative contribution)

Returns:

time-series array of net amplitude
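
The bandwidth normalization described above can be sketched as follows. Both functions are hypothetical illustrations operating on raw arrays, and they assume a row-per-frequency layout with inclusive band limits.

```python
import numpy as np


def band_amplitude(spectrogram, frequencies, band):
    """Sum the rows whose frequency lies inside band = [low, high]."""
    low, high = band
    rows = (frequencies >= low) & (frequencies <= high)
    return spectrogram[rows, :].sum(axis=0)


def net_amplitude_sketch(spectrogram, frequencies, signal_band, reject_bands=None):
    """Signal-band amplitude minus reject-band amplitude, per Hz of bandwidth."""
    signal_bandwidth = signal_band[1] - signal_band[0]
    net = band_amplitude(spectrogram, frequencies, signal_band) / signal_bandwidth
    if reject_bands:
        # Every reject band is divided by the combined reject bandwidth.
        total_reject_bandwidth = sum(high - low for low, high in reject_bands)
        for band in reject_bands:
            net = net - band_amplitude(spectrogram, frequencies, band) / total_reject_bandwidth
    return net


freqs = np.array([0.0, 1000.0, 2000.0, 3000.0])
spec = np.array([[1.0, 1.0],
                 [4.0, 8.0],
                 [2.0, 2.0],
                 [1.0, 1.0]])

net = net_amplitude_sketch(spec, freqs, signal_band=[500, 1500],
                           reject_bands=[[1500, 2500]])
print(net)
```

Dividing by bandwidth keeps a wide reject region from outweighing a narrow signal band simply because it contains more rows.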

plot(inline=True, fname=None, show_colorbar=False)¶

Plot the spectrogram with matplotlib.pyplot

Parameters:

inline=True –
fname=None – specify a string path to save the plot to (ending in .png/.pdf)
show_colorbar – include image legend colorbar from pyplot

to_image(shape=None, mode='RGB', spec_range=[-100, -20])¶

create a Pillow Image from the spectrogram

linearly rescales values from spec_range (default [-100, -20]) to [255,0] (i.e., -20 dB is loudest -> black, -100 dB is quietest -> white)

Parameters:

destination – a file path (string)
shape=None – tuple of image dimensions, e.g. (224,224)
mode="RGB" – RGB for 3-channel color or “L” for 1-channel grayscale
spec_range=[-100,-20] – the lowest and highest possible values in the spectrogram

Returns:

Pillow Image object
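
The inverted mapping from decibels to pixel intensities can be sketched as below. `to_pixels_sketch` is hypothetical and stops at an 8-bit array instead of building a Pillow Image or resizing. Clipping out-of-range values before scaling and rounding to the nearest integer are assumptions.

```python
import numpy as np


def to_pixels_sketch(spectrogram_db, spec_range=(-100, -20)):
    """Map dB values to 8-bit pixels: loudest -> 0 (black), quietest -> 255 (white)."""
    low, high = spec_range
    clipped = np.clip(np.asarray(spectrogram_db, dtype=float), low, high)  # assumption
    fraction = (clipped - low) / (high - low)  # 0 = quietest, 1 = loudest
    return np.round(255 * (1 - fraction)).astype(np.uint8)


pixels = to_pixels_sketch(np.array([[-100.0, -60.0, -20.0]]))
print(pixels)
```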

trim(start_time, end_time)¶

extract a time segment from a spectrogram

Parameters:

start_time – in seconds
end_time – in seconds

Returns:

spectrogram object from extracted time segment
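
Trimming is the time-axis counterpart of a frequency crop: only the columns whose time falls between `start_time` and `end_time` are kept. The sketch below uses a hypothetical `trim_sketch` function on raw arrays and assumes a column-per-time-step layout with inclusive limits.

```python
import numpy as np


def trim_sketch(spectrogram, times, start_time, end_time):
    """Return the columns (and their times) inside [start_time, end_time]."""
    spectrogram = np.asarray(spectrogram, dtype=float)
    times = np.asarray(times, dtype=float)
    keep = (times >= start_time) & (times <= end_time)
    return spectrogram[:, keep], times[keep]


spec = np.arange(10, dtype=float).reshape(2, 5)  # 2 frequency rows, 5 time columns
times = np.array([0.0, 0.5, 1.0, 1.5, 2.0])

segment, segment_times = trim_sketch(spec, times, start_time=0.5, end_time=1.5)
print(segment.shape, segment_times)
```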