Bioacoustic signal denoising: a review

Artificial Intelligence Review

Abstract

Animal biodiversity is declining rapidly due to factors such as habitat loss and degradation, invasive species, and environmental pollution. Recent advances in acoustic sensors provide a novel way to monitor animals by analyzing collected bioacoustic recordings. Accurate monitoring depends on a high-performing bioacoustic signal recognition model. However, because bioacoustic recordings are often obtained in open environments, various noise sources degrade audio quality, which causes problems for the automated analysis of animal sound recordings. Although various methods have been developed to address noise in different bioacoustic recordings, to the best of our knowledge no paper has yet reviewed and summarized those methods. The main aim of this paper is to provide a systematic survey of the existing literature on bioacoustic signal denoising. Based on an investigation of existing denoising methods for bioacoustic recordings, we discuss current challenges, possible opportunities, and future research directions.

Notes

  1. https://github.com/Jerry-Jie-Xie/BioacousticSignalDenoising.

  2. https://cis.whoi.edu/science/B/whalesounds/index.cfm.

  3. https://www.ecosounds.org/.

  4. https://archive.ics.uci.edu/ml/datasets/Anuran+Calls+%28MFCCs%29.

  5. https://www.kaggle.com/c/mlsp-2013-birds/data.

  6. http://www.avianz.net/.

  7. https://en.wikipedia.org/wiki/Xeno-canto.

Acknowledgements

This work is supported by the 111 Project. This work is also supported by Fundamental Research Funds for the Central Universities (Grant No: JUSRP11924) and Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment & Technology (Grant No: FM-2019-06). This work is partially supported by National Natural Science Foundation of China (Grant No: 61902154). This work is also partially supported by Natural Science Foundation of Jiangsu Province (Grant No: BK2019043526) and Jiangsu Province Post Doctoral Fund (Grant No: 2020Z430). We also want to thank the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for the institutional support.

Author information

Corresponding author

Correspondence to Jie Xie.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xie, J., Colonna, J.G. & Zhang, J. Bioacoustic signal denoising: a review. Artif Intell Rev 54, 3575–3597 (2021). https://doi.org/10.1007/s10462-020-09932-4

  • DOI: https://doi.org/10.1007/s10462-020-09932-4