Robust vocabulary recognition clustering model using an average estimator least mean square filter in noisy environments

Ahn, Chan-Shik; Oh, Sang-Yeob

doi:10.1007/s00779-013-0732-5

Original Article
Published: 22 October 2013

Volume 18, pages 1295–1301, (2014)

Personal and Ubiquitous Computing

Chan-Shik Ahn¹ &
Sang-Yeob Oh²

1202 Accesses
4 Citations

Abstract

Noise estimation and detection algorithms must adapt quickly to a changing environment, so they use a least mean square (LMS) filter. However, the performance of an LMS filter is very low, which in turn lowers speech recognition rates. To overcome this weakness, we propose a method for building a robust speech recognition clustering model for noisy environments. The method cancels noise with an average estimator least mean square (AELMS) filter, which preserves the source features of speech and reduces the degradation of speech information. After noise is canceled from the contaminated speech signal, the Gaussian state model is clustered to make recognition more robust to noise. We evaluated recognition performance with the resulting Gaussian clustering model in a noisy environment. Canceling continually changing environmental noise improved the signal-to-noise ratio of speech by 2.8 dB on average and the recognition rate by 4.1%.
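
For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of a textbook LMS adaptive noise canceller together with a signal-to-noise ratio measurement in dB. It is not the AELMS filter proposed in the article, whose details are not given here. The function names, the tap count, the step size, and the synthetic sine-plus-noise test signal are all assumptions chosen for illustration, so the numbers it produces are not comparable to the results reported above.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=8, step_size=0.01):
    """Textbook LMS noise canceller (illustrative; NOT the article's AELMS filter).

    primary   : noisy samples d[n] = speech[n] + noise[n]
    reference : samples x[n] of a signal correlated with the noise
    Returns the error signal e[n], used as the cleaned-speech estimate.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - noise_estimate
        # LMS update: step along the instantaneous gradient estimate.
        weights = [w + 2.0 * step_size * error * h
                   for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB of `estimate` relative to `clean`."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((e - s) ** 2 for s, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_power / noise_power)


def demo(num_samples=4000, seed=0):
    """Synthetic test (assumed setup): a sine stands in for speech."""
    rng = random.Random(seed)
    clean = [math.sin(2.0 * math.pi * 0.01 * n) for n in range(num_samples)]
    reference = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    # The noise reaching the primary input is a filtered copy of the reference.
    noise = [0.8 * reference[n] + 0.3 * (reference[n - 1] if n > 0 else 0.0)
             for n in range(num_samples)]
    primary = [s + v for s, v in zip(clean, noise)]
    cleaned = lms_noise_canceller(primary, reference)
    return snr_db(clean, primary), snr_db(clean, cleaned)


if __name__ == "__main__":
    before, after = demo()
    print(f"SNR before: {before:.1f} dB, after: {after:.1f} dB")
```

The step size controls the trade-off that motivates adaptive variants: a larger value adapts faster to changing noise but leaves more residual error, while a smaller value does the opposite.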

Acknowledgments

This work was supported by the Gachon University research fund of 2013 (GCU-2013-R235).

Author information

Authors and Affiliations

Seoul Metro Rapid Transit Media Co., Ltd, 85-2 KT Solution Support Centers, 4th Floor, Yeomri-dong, Mapo-Gu, Seoul, Korea
Chan-Shik Ahn
Department of Interactive Media, Gachon University, Bokjeong-dong, Sujeong-gu, Seongnam-si, Gyeonggi-do, 461-701, Korea
Sang-Yeob Oh

Corresponding author

Correspondence to Sang-Yeob Oh.

About this article

Cite this article

Ahn, CS., Oh, SY. Robust vocabulary recognition clustering model using an average estimator least mean square filter in noisy environments. Pers Ubiquit Comput 18, 1295–1301 (2014). https://doi.org/10.1007/s00779-013-0732-5

Received: 02 July 2013
Accepted: 26 September 2013
Published: 22 October 2013
Issue Date: August 2014
DOI: https://doi.org/10.1007/s00779-013-0732-5
