Dictionary-based active learning for sound event classification

Ji, Wanting; Wang, Ruili; Ma, Junbo

doi:10.1007/s11042-018-6380-z

Published: 25 July 2018

Volume 78, pages 3831–3842, (2019)

Multimedia Tools and Applications

Wanting Ji^1,2,
Ruili Wang^1,2 &
Junbo Ma^1,2

Abstract

This paper proposes a new dictionary-based active learning method for sound event classification that significantly reduces the number of labeled samples required for classifier training. Active learning is the process of selecting which samples should be labeled. In our method, active learning is based on clustering, and we use dictionary-based clustering because dictionary learning is better suited to sound event classification. The classifier is trained on both unlabeled sound segments (which are assigned predicted labels) and a small number of labeled samples. The proposed method and several reference methods are evaluated on a public urban sound dataset with 8732 sound segments, with classification accuracy as the performance measure. Experimental results show that the proposed method achieves higher classification accuracy while requiring far fewer labeled samples than the other methods.
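The abstract does not give the algorithmic details, so the following is only a minimal sketch of the general cluster-then-label idea it describes: cluster the unlabeled segments, ask an annotator for one label per cluster, propagate that label to the rest of the cluster, and train a classifier on the result. It is not the authors' method. Plain k-medoids on toy 2-D points stands in for the dictionary-based clustering, a nearest-centroid rule stands in for the classifier, and every name here (`k_medoids`, `propagate_labels`, `nearest_centroid`, the `oracle` callback, the toy labels) is hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_medoids(points, k, max_iter=20):
    """Plain k-medoids; returns {medoid_index: [member_indices]}."""
    medoids = list(range(k))  # assumption: naive deterministic initialisation
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: sq_dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = [
            min(members,
                key=lambda c: sum(sq_dist(points[c], points[j]) for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters


def propagate_labels(points, k, oracle):
    """Query one label per cluster (its medoid) and copy it to all members."""
    clusters = k_medoids(points, k)
    predicted = [None] * len(points)
    for medoid, members in clusters.items():
        label = oracle(medoid)  # the only human annotation for this cluster
        for i in members:
            predicted[i] = label
    return predicted, sorted(clusters)


def nearest_centroid(points, labels):
    """Stand-in classifier trained on queried plus propagated labels."""
    groups = {}
    for p, y in zip(points, labels):
        groups.setdefault(y, []).append(p)
    centroids = {y: tuple(sum(c) / len(ps) for c in zip(*ps))
                 for y, ps in groups.items()}
    return lambda x: min(centroids, key=lambda y: sq_dist(x, centroids[y]))


# Toy stand-in for sound-segment features; labels are illustrative only.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
truth = ["siren"] * 3 + ["dog_bark"] * 3

predicted, queried = propagate_labels(points, k=2, oracle=lambda i: truth[i])
classify = nearest_centroid(points, predicted)
print(queried)           # [0, 3]: only two of the six samples were annotated
print(classify((9, 9)))  # dog_bark
```

In this toy run two annotations label all six samples, which is the kind of saving the abstract refers to. On real data the clusters are not guaranteed to be pure, so the choice of features and clustering method largely determines how reliable the propagated labels are.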

Acknowledgments

This work was supported in part by the Natural Science Foundation of Zhejiang Province (No. LY18F010008) and the Marsden Fund of New Zealand.

Author information

Authors and Affiliations

Zhejiang Gongshang University, Hangzhou, China
Wanting Ji, Ruili Wang & Junbo Ma
Massey University, Auckland, New Zealand
Wanting Ji, Ruili Wang & Junbo Ma

Corresponding author

Correspondence to Wanting Ji.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ji, W., Wang, R. & Ma, J. Dictionary-based active learning for sound event classification. Multimed Tools Appl 78, 3831–3842 (2019). https://doi.org/10.1007/s11042-018-6380-z

Received: 05 November 2017
Revised: 16 June 2018
Accepted: 05 July 2018
Published: 25 July 2018
Issue Date: February 2019
DOI: https://doi.org/10.1007/s11042-018-6380-z
