Ruffed Grouse recognition method
This notebook demonstrates the use of the Ruffed Grouse drumming detector presented in Lapp et al 2022
We use a signal processing approach to detect ruffed grouse drumming events in a set of 5-minute annotated audio files. We then calculates metrics of the method’s performance.
Before beginning, download the data:
The data used in this notebook is available through Dryad at the following link:
https://doi.org/10.5061/dryad.hdr7sqvmc
contact sam.lapp (at) pitt.edu if the link is broken or you have trouble accessing the data
After downloading the data from Dryad, follow these steps:
uncompress or unzip the downloaded folder, which may be called
doi_10.5061_dryad.hdr7sqvmc__v2. This contains a README file and a second, zipped folder with the dataset.within this folder, uncompres or unzip the
ruffed_grouse_validation_set.zipspecify the path to the folder
ruffed_grouse_validation_setin the cell below
Related Manuscript: > Lapp, S., Larkin, J., Parker, H., Larkin, J., Shaffer, D., Tett, C., McNeil, D., Fiss, C., Kitzes, J., 2023. “Automated recognition of Ruffed Grouse drumming in field recordings”. Wildlife Society Bulletin.
[ ]:
dataset_path = "./ruffed_grouse_validation_set/" # change to the location of your data. current value is only correct if the folder is in the same directory as this notebook
import required packages
[ ]:
from opensoundscape.audio import Audio
from opensoundscape.spectrogram import Spectrogram
from opensoundscape.signal_processing import detect_peak_sequence_cwt
import numpy as np
import pandas as pd
from glob import glob
from pathlib import Path
from tqdm import tqdm
from matplotlib import pyplot as plt
[3]:
plt.rcParams['figure.figsize']=[15,5] #for big visuals
%config InlineBackend.figure_format = 'retina'
parameters
These parameter choices match those used in the WSB 2022 manuscript (Lapp et al. 2022) and are designed to detect ruffed grouse drumming events
[ ]:
sr = 400 # resample audio to this sample rate
wavelet = "morl"
peak_threshold = 0.2
window_len = 60 # sec to analyze in one chunk
center_frequency = 50 # for cwt
peak_separation = 15 / 400 # min duration (sec) between peaks
analysis function
The function detect_peak_sequence_cwt takes a path to an audio file and performs the automated drumming recognition on sequential non-overlapping clips (windows) of audio.
Each audio file is analyzed in 60-second non-overlapping windows. For each 60 second window, the function detect_peak_sequence_cwt performs two steps to detect drumming events:
Extracts peaks from a continuous wavelet transform (50 Hz center frequency) on the audio data
Searches the detected peaks for accelerating sequences that match the pattern of ruffed grouse drumming
For further details on the detection method and parameter choices, see the manuscript.
[ ]:
def detect_ruffed_grouse(path):
# load the audio file into and Audio object
audio = Audio.from_file(path)
# analyze 1 window at a time with the recognition method
# All parameters used here are the default parameters of this function in OpenSoundscape 0.7.0,
# however we write them exiplicitly in case defaults change in a future OpenSoundscape package version
return detect_peak_sequence_cwt(
audio,
sample_rate=sr,
window_len=window_len,
center_frequency=center_frequency,
wavelet=wavelet,
peak_threshold=peak_threshold,
peak_separation=peak_separation,
dt_range=[0.05, 0.8],
dy_range=[-0.2, 0],
d2y_range=[-0.05, 0.15],
max_skip=3,
duration_range=[1, 15],
points_range=[9, 100],
plot=False,
) # returns a dataframe summarizing all detected sequences
Analyze one file and examine a detection
[ ]:
path = f"{dataset_path}/audio/20200425/LSD-0028/20200425_114000_0s-300s.wav"
detect_ruffed_grouse(path)
| sequence_y | sequence_t | window_start_t | seq_len | seq_start_time | seq_end_time | seq_midpoint_time | |
|---|---|---|---|---|---|---|---|
| 0 | [0.5425226051085454, 0.46501937580732555, 0.40... | [65.5077294887287, 66.05025209383724, 66.51527... | 60.0 | 24 | 65.507729 | 70.335431 | 67.921580 |
| 1 | [0.5950247926996965, 0.4900204175173961, 0.437... | [149.8887453643902, 150.48377015708988, 150.97... | 120.0 | 22 | 149.888745 | 155.296471 | 152.592608 |
| 2 | [0.16000666694445598, 0.15750656277344888, 0.1... | [240.29751239634984, 240.68252843868495, 241.0... | 240.0 | 11 | 240.297512 | 241.992583 | 241.145048 |
TIP: if you recieve an error, make sure you’ve downloaded the data and moved the entire folder titled
ruffed_grouse_validation_setand correctly specified the path to it in the dataset_path variable
We can see that the first detection, which contains 24 pulses in a sequence, begins in the window starting at 60 seconds, and starts 5.5 seconds into that window. Let’s look at that section of the audio file as a spectrogram (note that we bandpass the spectrogram to 0-1000 Hz to zoom in on the low frequencies):
[ ]:
# load the relevant audio
audio_clip_with_detection = Audio.from_file(path, offset=65, duration=10)
# create a spectrogram
spec = Spectrogram.from_audio(audio_clip_with_detection).bandpass(0, 1000)
# plot the spectrogram
spec.plot()
# display the audio in a playable IPython object
from IPython import display
display.Audio(
audio_clip_with_detection.samples, rate=audio_clip_with_detection.sample_rate
)
We can clearly see the ruffed grouse drumming event in the spectrogram. If you have speakers or headphones with good low-frequency response, you can also hear the accellerating thumping of the ruffed grouse drumming display by clicking play on the audio player.
Analyze all files in the validation set
The next cell will take a while to run because it requires analyzing 1120 minutes of audio. If you wish, you can skip this cell (which analyzes the audio) and instead run the commented out cell below to load the results of the analysis from a .csv file.
[ ]:
# get a list of all files
files = glob("./ruffed_grouse_validation_set/audio/*/*/*.wav")
[ ]:
print(f"analyzing {len(files)} files")
# run the detector on each file, saving output dataframes in a list
all_results = []
for f in tqdm(files):
results_df = detect_ruffed_grouse(f)
if len(results_df) > 0:
results_df.index = [f] * len(results_df)
results_df.index.name = "file"
all_results.append(results_df)
if len(all_results) > 0:
all_results_df = pd.concat(all_results)
else:
print("no detections in this set of files")
analyzing 224 files
100%|██████████| 224/224 [00:20<00:00, 11.19it/s]
[10]:
# save results if desired
# all_results_df.to_csv('saved_recognizer_outputs.csv')
Take a quick look at the longest detected sequences
[ ]:
all_results_df.sort_values(by="seq_len", ascending=False).head()
| sequence_y | sequence_t | window_start_t | seq_len | seq_start_time | seq_end_time | seq_midpoint_time | |
|---|---|---|---|---|---|---|---|
| file | |||||||
| ./ruffed_grouse_validation_set/audio/20200425/LSD-0014/20200425_114000_0s-300s.wav | [0.6375265636068157, 0.5175215633984749, 0.460... | [24.738530772115503, 25.37605733572232, 25.893... | 0.0 | 25 | 24.738531 | 30.348765 | 27.543648 |
| ./ruffed_grouse_validation_set/audio/20200425/LSD-0028/20200425_114000_0s-300s.wav | [0.5425226051085454, 0.46501937580732555, 0.40... | [65.5077294887287, 66.05025209383724, 66.51527... | 60.0 | 24 | 65.507729 | 70.335431 | 67.921580 |
| ./ruffed_grouse_validation_set/audio/20200507/LSD-0029/20200507_114000_0s-300s.wav | [0.5475228134505628, 0.4650193758073229, 0.410... | [94.43643485145213, 94.9839576649027, 95.44897... | 60.0 | 24 | 94.436435 | 99.284137 | 96.860286 |
| ./ruffed_grouse_validation_set/audio/20200516/LSD-0014/20200516_114000_0s-300s.wav | [0.6175257302387598, 0.5250218759114964, 0.440... | [121.91257969082045, 122.5301054210592, 123.05... | 120.0 | 24 | 121.912580 | 127.455311 | 124.683945 |
| ./ruffed_grouse_validation_set/audio/20200516/LSD-0028/20200516_114000_0s-300s.wav | [0.5550231259635794, 0.4925205216884052, 0.425... | [108.9120380015834, 109.50456269011208, 109.99... | 60.0 | 24 | 108.912038 | 114.449769 | 111.680903 |
load labels for 1-minute clips
This table has a label for each minute of each 5-minute audio clip: 1 if it contains ruffed grouse drumming and 0 otherwise. It was generated from the raven selection tables included in ./validation-set/annotations. To see the details on how the table was generated, scroll to the bottom of this notebook.
[ ]:
label_df_1min = (
pd.read_csv(f"{dataset_path}/annotations/1minute_presence_absence.csv")
.set_index("file")
.astype(int)
)
label_df_1min.head(4)
| 1 | 2 | 3 | 4 | 5 | |
|---|---|---|---|---|---|
| file | |||||
| ./validation_set/audio/20200425/LSD-0021/20200425_114000_0s-300s.wav | 0 | 0 | 0 | 0 | 0 |
| ./validation_set/audio/20200425/LSD-0019/20200425_114000_0s-300s.wav | 0 | 0 | 0 | 0 | 0 |
| ./validation_set/audio/20200425/LSD-0010/20200425_114000_0s-300s.wav | 0 | 0 | 0 | 0 | 0 |
| ./validation_set/audio/20200425/LSD-0028/20200425_114000_0s-300s.wav | 0 | 1 | 1 | 1 | 1 |
Evaluate the performance of the recognizer
Now that we have created 1-minute labels on the validation set and have generated detections with the automated recognizer, we can evaluate its performance using precision, recall, and f1 score (see here for definitions).
[13]:
from sklearn.metrics import precision_recall_fscore_support
Summarize detection/non-detection data (1 or 0) on 1-minute clips for the automated recognizer scores.
[ ]:
all_results_df["minute"] = (all_results_df.window_start_t // 60 + 1).astype(int)
all_results_df.index = [
Path(f).parent.name + "/" + Path(f).name for f in all_results_df.index
]
label_df_1min.index = [
Path(f).parent.name + "/" + Path(f).name for f in label_df_1min.index
]
# use parent folder and file name for unique id
# create a table of 0/1 (non-detection/detection) for 5 minute file x minute
detection_1min = [
[
len(all_results_df[(all_results_df.index == id) & (all_results_df.minute == m)])
> 0
for m in [1, 2, 3, 4, 5]
]
for id in label_df_1min.index
]
detection_df_1min = pd.DataFrame(
detection_1min, index=label_df_1min.index, columns=label_df_1min.columns
)
Also summarize the maximum sequence length detected per clip per minute
[ ]:
# create a table of max sequence len for 5 minute file x minute
max_seq_1min = [
[
(
all_results_df[(all_results_df.index == id) & (all_results_df.minute == m)]
).seq_len.max()
for m in [1, 2, 3, 4, 5]
]
for id in label_df_1min.index
]
seq_df_1min = pd.DataFrame(
max_seq_1min, index=label_df_1min.index, columns=label_df_1min.columns
).fillna(0)
flatten the labels and values to 1-d arrays (the format expected by sklearn.metrics)
[16]:
labels = [l for row in label_df_1min.values for l in row]
detections = [d for row in detection_df_1min.values for d in row]
max_seq_flat = [d for row in seq_df_1min.values for d in row]
Calculate performance metrics on the validation set (1 minute clips)
[ ]:
precision, recall, f1, support = precision_recall_fscore_support(labels, detections)
print(f"precision for ruffed grouse presence: {precision[1]:0.3f}")
print(f"recall for ruffed grouse presence: {recall[1]:0.3f}")
print(f"F1 score for ruffed grouse presence: {f1[1]:0.3f}")
precision for ruffed grouse presence: 0.725
recall for ruffed grouse presence: 0.784
F1 score for ruffed grouse presence: 0.753
[ ]:
from sklearn.metrics import confusion_matrix
[[true_negatives, false_negatives], [false_positives, true_positives]] = (
confusion_matrix(labels, detections)
)
print(f"true negatives: {true_negatives}")
print(f"false negatives: {false_negatives}")
print(f"false positives: {false_positives}")
print(f"true positives: {true_positives}")
true negatives: 1072
false negatives: 11
false positives: 8
true positives: 29
Plot a histogram of max sequence length detected for positive and negative clips
[ ]:
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
fig.subplots_adjust(hspace=0.05) # adjust space between axes
positives = np.array(max_seq_flat)[np.array(labels).astype(bool)]
negatives = np.array(max_seq_flat)[~np.array(labels).astype(bool)]
# plot the same data on both axes
ax1.hist(positives, label="positives", alpha=0.4, bins=np.linspace(0, 25, 10))
ax1.hist(negatives, label="negatives", alpha=0.4, bins=np.linspace(0, 25, 10))
ax2.hist(positives, label="positives", alpha=0.4, bins=np.linspace(0, 25, 10))
ax2.hist(negatives, label="negatives", alpha=0.4, bins=np.linspace(0, 25, 10))
# zoom-in / limit the view to different portions of the data
ax1.set_ylim(1070, 1080) # outliers only
ax2.set_ylim(0, 10) # most of the data
# hide the spines between ax and ax2
ax1.spines.bottom.set_visible(False)
ax2.spines.top.set_visible(False)
ax1.xaxis.tick_top()
ax1.tick_params(labeltop=False) # don't put tick labels at the top
ax2.xaxis.tick_bottom()
plt.ylabel("number of sequences")
plt.xlabel("sequence length")
plt.legend()
<matplotlib.legend.Legend at 0x32939d7f0>
Plot precision and recall against sequence length
[ ]:
from sklearn.metrics import precision_recall_curve
precision, recall, threshold = precision_recall_curve(labels, max_seq_flat)
plt.plot(threshold, precision[1:], label="precision")
plt.plot(threshold, recall[1:], label="recall")
plt.legend()
plt.xlabel("sequence length")
Text(0.5, 0, 'sequence length')