ISCA Archive Interspeech 2021

ORCA-SLANG: An Automatic Multi-Stage Semi-Supervised Deep Learning Framework for Large-Scale Killer Whale Call Type Identification

Christian Bergler, Manuel Schmitt, Andreas Maier, Helena Symonds, Paul Spong, Steven R. Ness, George Tzanetakis, Elmar Nöth

Identifying animal-specific vocalization patterns is a prerequisite for decoding animal communication. In bioacoustics, passive acoustic recording setups are increasingly deployed to acquire large-scale datasets. Earlier research has usually established prior knowledge about animal-specific call types. However, time and human-resource constraints, combined with a lack of available machine-based approaches, allow manual analysis of only comparatively small data corpora, which strongly distorts the actual data representation and information value. Such data limitations restrict the identification of existing population-, group-, and individual-specific call types and sub-categories, as well as unseen vocalization patterns. Machine learning therefore forms the basis for animal-specific call type recognition and facilitates deeper insights into communication. The current study is the first to fuse task-specific neural networks into a fully automated, multi-stage, deep-learning-based framework, entitled ORCA-SLANG, which performs semi-supervised call type identification in one of the largest animal-specific bioacoustic archives, the Orchive. Orca/noise segmentation, denoising, and subsequent feature learning provide robust representations for semi-supervised clustering/classification. The result is a machine-annotated call type data repository containing 235,369 unique calls.


doi: 10.21437/Interspeech.2021-616

Cite as: Bergler, C., Schmitt, M., Maier, A., Symonds, H., Spong, P., Ness, S.R., Tzanetakis, G., Nöth, E. (2021) ORCA-SLANG: An Automatic Multi-Stage Semi-Supervised Deep Learning Framework for Large-Scale Killer Whale Call Type Identification. Proc. Interspeech 2021, 2396-2400, doi: 10.21437/Interspeech.2021-616

@inproceedings{bergler21_interspeech,
  author={Christian Bergler and Manuel Schmitt and Andreas Maier and Helena Symonds and Paul Spong and Steven R. Ness and George Tzanetakis and Elmar Nöth},
  title={{ORCA-SLANG: An Automatic Multi-Stage Semi-Supervised Deep Learning Framework for Large-Scale Killer Whale Call Type Identification}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={2396--2400},
  doi={10.21437/Interspeech.2021-616}
}