Orientation for PyTorch users
OpenSoundscape uses PyTorch “under the hood” for machine learning tasks. If you’re familiar with PyTorch, these connections to the OpenSoundscape API should let you start using pieces of OpenSoundscape within your current workflows without much hacking:
- AudioFileDataset and AudioSplittingDataset subclass torch.utils.data.Dataset and include preprocessing and augmentation. If you use them as a drop-in substitute for a PyTorch Dataset, pay attention to your DataLoader's collate_fn. You can set collate_fn to opensoundscape.ml.utils.collate_audio_samples_to_tensors to stack samples and labels with a leading batch dimension, as is typical in PyTorch.
- The opensoundscape.ml.CNN class's .network attribute is a PyTorch model object.
- The CNN class provides the high-level methods .train(), .predict(), .eval(), and .generate_cams(), which do what their names suggest.
- Integration with the Weights and Biases logging platform is built in: pass a wandb session object to CNN's .train() or .predict() method to get live logging of samples and metrics.
- The CNN's .preprocessor attribute (which becomes the .preprocessor of the AudioFileDataset or AudioSplittingDataset during .train() and .predict()) defines a sequential series of preprocessing and augmentation actions performed on each sample. Specifically, preprocessor.pipeline is a pd.Series, and each item is an instance of the opensoundscape.preprocess.action.Action class (or a subclass). See the preprocessing tutorial notebook for specifics on modifying preprocessing.
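To make the idea of a sequential preprocessing pipeline concrete, here is a minimal standalone sketch in plain Python. It does not use OpenSoundscape: the Action class, its apply method and enabled flag, and the run_pipeline helper are all hypothetical stand-ins, and an ordinary dict takes the place of the pd.Series that holds the real pipeline. It only illustrates the pattern of named steps applied in order to each sample.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Action:
    """Hypothetical stand-in for one preprocessing step (not the OpenSoundscape class)."""

    fn: Callable[[List[float]], List[float]]
    enabled: bool = True  # assumed flag: disabled steps are skipped

    def apply(self, sample: List[float]) -> List[float]:
        return self.fn(sample)


def run_pipeline(pipeline: Dict[str, Action], sample: List[float]) -> List[float]:
    """Apply each enabled action in insertion order and return the result."""
    for _name, action in pipeline.items():
        if action.enabled:
            sample = action.apply(sample)
    return sample


def peak_normalize(samples: List[float]) -> List[float]:
    """Scale samples so the largest absolute value is 1 (no-op for silence)."""
    peak = max(abs(v) for v in samples) or 1.0
    return [v / peak for v in samples]


# A dict keeps insertion order, mimicking a named, ordered series of steps.
pipeline = {
    "trim": Action(lambda x: x[:4]),  # keep the first 4 samples
    "normalize": Action(peak_normalize),
    "invert": Action(lambda x: [-v for v in x], enabled=False),  # skipped
}

result = run_pipeline(pipeline, [0.5, -2.0, 1.0, 0.0, 3.0])
print(result)
```

Because the steps are stored by name and run in order, changing the preprocessing amounts to adding, removing, reordering, or disabling entries rather than rewriting the dataset class. The preprocessing tutorial notebook describes how this is done with the actual Action objects.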