Abstract
To make the best use of speech recognition, a system must recognize not only the speech or the speaker but also the domain of communication. This paper proposes an approach for acoustic domain recognition that uses an ensemble-based three-level architecture instead of a single classifier for training and testing. The approach first evaluates the predictions of several candidate classifiers, then selects a set of three classifiers such that at least one of the three contains the target predictions; these predictions are in turn used to train a random forest classifier, which yields the final classification results on the test data set. Experimental results indicate that the proposed method performs consistently as the data size increases, achieving an acceptable accuracy of 76.36%.
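As a rough illustration only (not the authors' implementation), the multilevel idea resembles a stacking setup in which the predictions of a pool of base classifiers are used to train a random forest meta-classifier. The sketch below uses scikit-learn with synthetic placeholder data; the specific base classifiers, features, and dataset are assumptions for demonstration.

# Minimal sketch, assuming a generic stacking approximation of the
# three-level ensemble: base classifiers -> their predictions ->
# a random forest trained on those predictions. Data and classifier
# choices are placeholders, not the paper's actual pipeline.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Placeholder acoustic-feature matrix and domain labels
X, y = make_classification(n_samples=500, n_features=20, n_classes=3,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Lower levels: a pool of base classifiers whose predictions feed the next level
base_classifiers = [
    ("svm", SVC(probability=True, random_state=0)),
    ("nb", GaussianNB()),
    ("lr", LogisticRegression(max_iter=1000)),
]

# Final level: a random forest trained on the base classifiers' predictions
ensemble = StackingClassifier(
    estimators=base_classifiers,
    final_estimator=RandomForestClassifier(n_estimators=100, random_state=0),
)
ensemble.fit(X_train, y_train)
print("Held-out accuracy:", ensemble.score(X_test, y_test))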





Cite this article
Rathor, S., Jadon, R.S. Acoustic domain classification and recognition through ensemble based multilevel classification. J Ambient Intell Human Comput 10, 3617–3627 (2019). https://doi.org/10.1007/s12652-018-1087-6