Classification of acoustical signals by combining active learning strategies with semi-supervised learning schemes

  • S.I. : Information, Intelligence, Systems and Applications
Neural Computing and Applications

Abstract

In real-world settings, learning from both labeled and unlabeled data has attracted growing interest among data scientists and machine learning engineers, leading to several approaches that augment the available training data in order to obtain learning behavior that is both robust and sufficiently accurate. The main motivation is the abundance of unlabeled data that conventional supervised approaches ignore, which forfeits the chance to enrich the final learned hypothesis. However, most of the proposed methods that operate on both kinds of data exploit only one family of such algorithms, without combining their strategies. Since the most popular families for classification tasks are Active Learning and Semi-supervised Learning, we design a framework that combines the two and fuses their advantages within the core of the learning process. We conduct an empirical evaluation of this combined approach, operating under the pool-based scenario, on three problems that stem from different fields but are all tackled through acoustical signals: gender identification, emotion detection and automatic speaker recognition. Within the proposed framework, which operates on training sets of small cardinality, our results demonstrate the benefits of such semi-automated approaches, both in the predictive accuracy achieved under reduced resource consumption and in the smoothness of the learning convergence. Several learners have been examined in order to reach more general conclusions, and a variant of the self-training scheme has also been examined.
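To make the general idea concrete, the following is a minimal sketch of one way a pool-based loop might interleave active learning with self-training. It is not the authors' framework: the nearest-centroid learner, the uncertainty-based query rule, the toy two-class data and every name and parameter here (`al_ssl_loop`, `n_query`, `n_pseudo`, `oracle`, `toy_blobs`) are assumptions chosen only for illustration. In each round the least confident pool instances are sent to a human oracle, while the most confident ones are added with their predicted labels.

```python
import math
import random


def fit_centroids(X, y):
    """Stand-in learner (an assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_proba(model, x):
    """Softmax over negative Euclidean distances to the class centroids."""
    labels = sorted(model)
    scores = [math.exp(-math.dist(x, model[c])) for c in labels]
    total = sum(scores)
    return labels, [s / total for s in scores]


def predict(model, x):
    labels, probs = predict_proba(model, x)
    return labels[probs.index(max(probs))]


def al_ssl_loop(X_lab, y_lab, X_pool, oracle, rounds=5, n_query=2, n_pseudo=4):
    """Each round: query the oracle for the least confident pool items
    (active learning) and pseudo-label the most confident ones (self-training)."""
    X_lab, y_lab = list(X_lab), list(y_lab)
    pool = list(range(len(X_pool)))
    n_queries = 0
    for _ in range(rounds):
        if not pool:
            break
        model = fit_centroids(X_lab, y_lab)
        scored = []
        for i in pool:
            labels, probs = predict_proba(model, X_pool[i])
            top = max(probs)
            scored.append((top, i, labels[probs.index(top)]))
        scored.sort()  # ascending confidence
        queried = scored[:n_query]
        confident = scored[n_query:][-n_pseudo:] if n_pseudo > 0 else []
        for _conf, i, _guess in queried:
            X_lab.append(X_pool[i])
            y_lab.append(oracle(i))  # human-provided label
            n_queries += 1
        for _conf, i, guess in confident:
            X_lab.append(X_pool[i])
            y_lab.append(guess)  # model-provided pseudo-label
        used = {i for _conf, i, _guess in queried + confident}
        pool = [i for i in pool if i not in used]
    return fit_centroids(X_lab, y_lab), n_queries


def toy_blobs(n_per_class, rng):
    """Two well-separated 2-D Gaussian blobs standing in for acoustic features."""
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 6.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


def demo(seed=0):
    """Start from two labeled items per class and report held-out accuracy."""
    rng = random.Random(seed)
    X, y = toy_blobs(50, rng)
    seed_idx = [0, 1, 50, 51]
    pool_idx = [i for i in range(len(X)) if i not in seed_idx]
    X_pool = [X[i] for i in pool_idx]
    y_pool = [y[i] for i in pool_idx]
    model, n_queries = al_ssl_loop(
        [X[i] for i in seed_idx], [y[i] for i in seed_idx],
        X_pool, oracle=lambda i: y_pool[i],
    )
    X_test, y_test = toy_blobs(100, rng)
    hits = sum(predict(model, x) == t for x, t in zip(X_test, y_test))
    return hits / len(y_test), n_queries
```

Calling `demo()` returns the held-out accuracy together with the number of oracle queries spent. With the defaults above the labeled set grows by six instances per round while only two of them require human annotation, which is the kind of resource saving the abstract refers to.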

Author information

Corresponding author

Correspondence to Stamatis Karlos.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Karlos, S., Aridas, C., Kanas, V.G. et al. Classification of acoustical signals by combining active learning strategies with semi-supervised learning schemes. Neural Comput & Applic 35, 3–20 (2023). https://doi.org/10.1007/s00521-021-05749-6

  • DOI: https://doi.org/10.1007/s00521-021-05749-6
