Ruffed Grouse recognition method

This notebook demonstrates the use of the Ruffed Grouse drumming detector presented in Lapp et al 2022

We use a signal processing approach to detect ruffed grouse drumming events in a set of 5-minute annotated audio files. We then calculates metrics of the method’s performance.

Before beginning, download the data:

The data used in this notebook is available through Dryad at the following link:

https://doi.org/10.5061/dryad.hdr7sqvmc

contact sam.lapp (at) pitt.edu if the link is broken or you have trouble accessing the data

After downloading the data from Dryad, follow these steps:

  • uncompress or unzip the downloaded folder, which may be called doi_10.5061_dryad.hdr7sqvmc__v2. This contains a README file and a second, zipped folder with the dataset.

  • within this folder, uncompres or unzip the ruffed_grouse_validation_set.zip

  • specify the path to the folder ruffed_grouse_validation_set in the cell below

Related Manuscript: > Lapp, S., Larkin, J., Parker, H., Larkin, J., Shaffer, D., Tett, C., McNeil, D., Fiss, C., Kitzes, J., 2023. “Automated recognition of Ruffed Grouse drumming in field recordings”. Wildlife Society Bulletin.

[ ]:
dataset_path = "./ruffed_grouse_validation_set/"  # change to the location of your data. current value is only correct if the folder is in the same directory as this notebook

import required packages

[ ]:
from opensoundscape.audio import Audio
from opensoundscape.spectrogram import Spectrogram
from opensoundscape.signal_processing import detect_peak_sequence_cwt

import numpy as np
import pandas as pd
from glob import glob
from pathlib import Path

from tqdm import tqdm

from matplotlib import pyplot as plt
[3]:
plt.rcParams['figure.figsize']=[15,5] #for big visuals
%config InlineBackend.figure_format = 'retina'

parameters

These parameter choices match those used in the WSB 2022 manuscript (Lapp et al. 2022) and are designed to detect ruffed grouse drumming events

[ ]:
sr = 400  # resample audio to this sample rate
wavelet = "morl"
peak_threshold = 0.2
window_len = 60  # sec to analyze in one chunk
center_frequency = 50  # for cwt
peak_separation = 15 / 400  # min duration (sec) between peaks

analysis function

The function detect_peak_sequence_cwt takes a path to an audio file and performs the automated drumming recognition on sequential non-overlapping clips (windows) of audio.

Each audio file is analyzed in 60-second non-overlapping windows. For each 60 second window, the function detect_peak_sequence_cwt performs two steps to detect drumming events:

  1. Extracts peaks from a continuous wavelet transform (50 Hz center frequency) on the audio data

  2. Searches the detected peaks for accelerating sequences that match the pattern of ruffed grouse drumming

For further details on the detection method and parameter choices, see the manuscript.

[ ]:
def detect_ruffed_grouse(path):
    # load the audio file into and Audio object
    audio = Audio.from_file(path)

    # analyze 1 window at a time with the recognition method

    # All parameters used here are the default parameters of this function in OpenSoundscape 0.7.0,
    # however we write them exiplicitly in case defaults change in a future OpenSoundscape package version
    return detect_peak_sequence_cwt(
        audio,
        sample_rate=sr,
        window_len=window_len,
        center_frequency=center_frequency,
        wavelet=wavelet,
        peak_threshold=peak_threshold,
        peak_separation=peak_separation,
        dt_range=[0.05, 0.8],
        dy_range=[-0.2, 0],
        d2y_range=[-0.05, 0.15],
        max_skip=3,
        duration_range=[1, 15],
        points_range=[9, 100],
        plot=False,
    )  # returns a dataframe summarizing all detected sequences

Analyze one file and examine a detection

[ ]:
path = f"{dataset_path}/audio/20200425/LSD-0028/20200425_114000_0s-300s.wav"
detect_ruffed_grouse(path)
sequence_y sequence_t window_start_t seq_len seq_start_time seq_end_time seq_midpoint_time
0 [0.5425226051085454, 0.46501937580732555, 0.40... [65.5077294887287, 66.05025209383724, 66.51527... 60.0 24 65.507729 70.335431 67.921580
1 [0.5950247926996965, 0.4900204175173961, 0.437... [149.8887453643902, 150.48377015708988, 150.97... 120.0 22 149.888745 155.296471 152.592608
2 [0.16000666694445598, 0.15750656277344888, 0.1... [240.29751239634984, 240.68252843868495, 241.0... 240.0 11 240.297512 241.992583 241.145048

TIP: if you recieve an error, make sure you’ve downloaded the data and moved the entire folder titled ruffed_grouse_validation_set and correctly specified the path to it in the dataset_path variable

We can see that the first detection, which contains 24 pulses in a sequence, begins in the window starting at 60 seconds, and starts 5.5 seconds into that window. Let’s look at that section of the audio file as a spectrogram (note that we bandpass the spectrogram to 0-1000 Hz to zoom in on the low frequencies):

[ ]:
# load the relevant audio
audio_clip_with_detection = Audio.from_file(path, offset=65, duration=10)

# create a spectrogram
spec = Spectrogram.from_audio(audio_clip_with_detection).bandpass(0, 1000)

# plot the spectrogram
spec.plot()

# display the audio in a playable IPython object
from IPython import display

display.Audio(
    audio_clip_with_detection.samples, rate=audio_clip_with_detection.sample_rate
)
../_images/tutorials_ruffed_grouse_detector_13_1.png

We can clearly see the ruffed grouse drumming event in the spectrogram. If you have speakers or headphones with good low-frequency response, you can also hear the accellerating thumping of the ruffed grouse drumming display by clicking play on the audio player.

Analyze all files in the validation set

The next cell will take a while to run because it requires analyzing 1120 minutes of audio. If you wish, you can skip this cell (which analyzes the audio) and instead run the commented out cell below to load the results of the analysis from a .csv file.

[ ]:
# get a list of all files
files = glob("./ruffed_grouse_validation_set/audio/*/*/*.wav")
[ ]:
print(f"analyzing {len(files)} files")

# run the detector on each file, saving output dataframes in a list
all_results = []
for f in tqdm(files):
    results_df = detect_ruffed_grouse(f)
    if len(results_df) > 0:
        results_df.index = [f] * len(results_df)
        results_df.index.name = "file"
        all_results.append(results_df)

if len(all_results) > 0:
    all_results_df = pd.concat(all_results)
else:
    print("no detections in this set of files")
analyzing 224 files
100%|██████████| 224/224 [00:20<00:00, 11.19it/s]
[10]:
# save results if desired
# all_results_df.to_csv('saved_recognizer_outputs.csv')

Take a quick look at the longest detected sequences

[ ]:
all_results_df.sort_values(by="seq_len", ascending=False).head()
sequence_y sequence_t window_start_t seq_len seq_start_time seq_end_time seq_midpoint_time
file
./ruffed_grouse_validation_set/audio/20200425/LSD-0014/20200425_114000_0s-300s.wav [0.6375265636068157, 0.5175215633984749, 0.460... [24.738530772115503, 25.37605733572232, 25.893... 0.0 25 24.738531 30.348765 27.543648
./ruffed_grouse_validation_set/audio/20200425/LSD-0028/20200425_114000_0s-300s.wav [0.5425226051085454, 0.46501937580732555, 0.40... [65.5077294887287, 66.05025209383724, 66.51527... 60.0 24 65.507729 70.335431 67.921580
./ruffed_grouse_validation_set/audio/20200507/LSD-0029/20200507_114000_0s-300s.wav [0.5475228134505628, 0.4650193758073229, 0.410... [94.43643485145213, 94.9839576649027, 95.44897... 60.0 24 94.436435 99.284137 96.860286
./ruffed_grouse_validation_set/audio/20200516/LSD-0014/20200516_114000_0s-300s.wav [0.6175257302387598, 0.5250218759114964, 0.440... [121.91257969082045, 122.5301054210592, 123.05... 120.0 24 121.912580 127.455311 124.683945
./ruffed_grouse_validation_set/audio/20200516/LSD-0028/20200516_114000_0s-300s.wav [0.5550231259635794, 0.4925205216884052, 0.425... [108.9120380015834, 109.50456269011208, 109.99... 60.0 24 108.912038 114.449769 111.680903

load labels for 1-minute clips

This table has a label for each minute of each 5-minute audio clip: 1 if it contains ruffed grouse drumming and 0 otherwise. It was generated from the raven selection tables included in ./validation-set/annotations. To see the details on how the table was generated, scroll to the bottom of this notebook.

[ ]:
label_df_1min = (
    pd.read_csv(f"{dataset_path}/annotations/1minute_presence_absence.csv")
    .set_index("file")
    .astype(int)
)
label_df_1min.head(4)
1 2 3 4 5
file
./validation_set/audio/20200425/LSD-0021/20200425_114000_0s-300s.wav 0 0 0 0 0
./validation_set/audio/20200425/LSD-0019/20200425_114000_0s-300s.wav 0 0 0 0 0
./validation_set/audio/20200425/LSD-0010/20200425_114000_0s-300s.wav 0 0 0 0 0
./validation_set/audio/20200425/LSD-0028/20200425_114000_0s-300s.wav 0 1 1 1 1

Evaluate the performance of the recognizer

Now that we have created 1-minute labels on the validation set and have generated detections with the automated recognizer, we can evaluate its performance using precision, recall, and f1 score (see here for definitions).

[13]:
from sklearn.metrics import precision_recall_fscore_support

Summarize detection/non-detection data (1 or 0) on 1-minute clips for the automated recognizer scores.

[ ]:
all_results_df["minute"] = (all_results_df.window_start_t // 60 + 1).astype(int)
all_results_df.index = [
    Path(f).parent.name + "/" + Path(f).name for f in all_results_df.index
]
label_df_1min.index = [
    Path(f).parent.name + "/" + Path(f).name for f in label_df_1min.index
]

# use parent folder and file name for unique id

# create a table of 0/1 (non-detection/detection) for 5 minute file x minute
detection_1min = [
    [
        len(all_results_df[(all_results_df.index == id) & (all_results_df.minute == m)])
        > 0
        for m in [1, 2, 3, 4, 5]
    ]
    for id in label_df_1min.index
]
detection_df_1min = pd.DataFrame(
    detection_1min, index=label_df_1min.index, columns=label_df_1min.columns
)

Also summarize the maximum sequence length detected per clip per minute

[ ]:
# create a table of max sequence len for 5 minute file x minute
max_seq_1min = [
    [
        (
            all_results_df[(all_results_df.index == id) & (all_results_df.minute == m)]
        ).seq_len.max()
        for m in [1, 2, 3, 4, 5]
    ]
    for id in label_df_1min.index
]
seq_df_1min = pd.DataFrame(
    max_seq_1min, index=label_df_1min.index, columns=label_df_1min.columns
).fillna(0)

flatten the labels and values to 1-d arrays (the format expected by sklearn.metrics)

[16]:
labels = [l for row in label_df_1min.values for l in row]
detections = [d for row in detection_df_1min.values for d in row]
max_seq_flat = [d for row in seq_df_1min.values for d in row]

Calculate performance metrics on the validation set (1 minute clips)

[ ]:
precision, recall, f1, support = precision_recall_fscore_support(labels, detections)
print(f"precision for ruffed grouse presence: {precision[1]:0.3f}")
print(f"recall for ruffed grouse presence: {recall[1]:0.3f}")
print(f"F1 score for ruffed grouse presence: {f1[1]:0.3f}")
precision for ruffed grouse presence: 0.725
recall for ruffed grouse presence: 0.784
F1 score for ruffed grouse presence: 0.753
[ ]:
from sklearn.metrics import confusion_matrix

[[true_negatives, false_negatives], [false_positives, true_positives]] = (
    confusion_matrix(labels, detections)
)
print(f"true negatives: {true_negatives}")
print(f"false negatives: {false_negatives}")
print(f"false positives: {false_positives}")
print(f"true positives: {true_positives}")
true negatives: 1072
false negatives: 11
false positives: 8
true positives: 29

Plot a histogram of max sequence length detected for positive and negative clips

[ ]:
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
fig.subplots_adjust(hspace=0.05)  # adjust space between axes

positives = np.array(max_seq_flat)[np.array(labels).astype(bool)]
negatives = np.array(max_seq_flat)[~np.array(labels).astype(bool)]
# plot the same data on both axes
ax1.hist(positives, label="positives", alpha=0.4, bins=np.linspace(0, 25, 10))
ax1.hist(negatives, label="negatives", alpha=0.4, bins=np.linspace(0, 25, 10))
ax2.hist(positives, label="positives", alpha=0.4, bins=np.linspace(0, 25, 10))
ax2.hist(negatives, label="negatives", alpha=0.4, bins=np.linspace(0, 25, 10))

# zoom-in / limit the view to different portions of the data
ax1.set_ylim(1070, 1080)  # outliers only
ax2.set_ylim(0, 10)  # most of the data

# hide the spines between ax and ax2
ax1.spines.bottom.set_visible(False)
ax2.spines.top.set_visible(False)
ax1.xaxis.tick_top()
ax1.tick_params(labeltop=False)  # don't put tick labels at the top
ax2.xaxis.tick_bottom()
plt.ylabel("number of sequences")
plt.xlabel("sequence length")
plt.legend()
<matplotlib.legend.Legend at 0x32939d7f0>
../_images/tutorials_ruffed_grouse_detector_35_1.png

Plot precision and recall against sequence length

[ ]:
from sklearn.metrics import precision_recall_curve

precision, recall, threshold = precision_recall_curve(labels, max_seq_flat)
plt.plot(threshold, precision[1:], label="precision")
plt.plot(threshold, recall[1:], label="recall")
plt.legend()
plt.xlabel("sequence length")
Text(0.5, 0, 'sequence length')
../_images/tutorials_ruffed_grouse_detector_37_1.png