Tree distributions approximation model for robust discrete speech recognition

Hammami, Nacereddine; Bedda, Mouldi; Farah, Nadir

doi:10.1007/s10772-012-9141-9

Published: 17 April 2012

Volume 15, pages 455–462 (2012)

International Journal of Speech Technology

Nacereddine Hammami¹,
Mouldi Bedda² &
Nadir Farah¹

Abstract

This paper proposes a new discrete speech recognition method that investigates the capability of graphical models based on tree distributions, which are widely used in many optimization areas. A novel spanning tree structure that exploits the temporal nature of the speech signal is proposed. The proposed structure significantly reduces complexity because it captures only a few essential relationships rather than considering all possible tree structures. The application of this model is illustrated on several isolated-word databases. Experiments show that, compared with the conventional discrete hidden Markov model (DHMM), the proposed approaches yield error-rate reductions of 2.54 %–12 % and improve recognition speed at least 3-fold. In addition, an impressive gain in learning time is observed. The overall recognition accuracy was 93.09 %–95.34 %, confirming the effectiveness of the proposed methods.
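As background for readers unfamiliar with tree distributions, the following is a minimal sketch of the classical Chow and Liu procedure: estimate the pairwise mutual information between discrete variables, then keep a maximum-weight spanning tree over them. It is not the spanning tree structure proposed in the paper, whose details are not given in this abstract. The function names, the toy data, and the reading of each column as one discrete symbol position (for example a vector-quantization index) are assumptions made only for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(samples, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    count_i = Counter(s[i] for s in samples)
    count_j = Counter(s[j] for s in samples)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
    return mi


def chow_liu_edges(samples):
    """Edges of a maximum-weight spanning tree, built with Kruskal's algorithm."""
    d = len(samples[0])
    weighted = sorted(
        ((mutual_information(samples, i, j), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,  # strongest dependencies first
    )
    parent = list(range(d))  # union-find forest used to reject cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = []
    for _, i, j in weighted:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two separate components
            parent[root_i] = root_j
            edges.append((i, j))
    return edges


# Hypothetical toy data: four discrete variables per sample, where
# column 1 copies column 0 and column 3 copies column 2.
samples = [(0, 0, 0, 0), (0, 0, 1, 1), (1, 1, 0, 0), (1, 1, 1, 1)]
print(chow_liu_edges(samples))
```

On this toy data the two strongly dependent pairs, (0, 1) and (2, 3), are always selected; the third edge connects the two pairs with zero mutual information, so its choice is arbitrary. Searching over all variable pairs costs quadratic time in the number of variables, which is the kind of cost a fixed, temporally motivated structure could avoid.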

Author information

Authors and Affiliations

Laboratoire LabGed, Universite Badji Mokhtar Annaba, Annaba, 23000, Algeria
Nacereddine Hammami & Nadir Farah
College of Engineering, Aljouf University, Sakaka, KSA
Mouldi Bedda

Corresponding author

Correspondence to Nacereddine Hammami.

About this article

Cite this article

Hammami, N., Bedda, M. & Farah, N. Tree distributions approximation model for robust discrete speech recognition. Int J Speech Technol 15, 455–462 (2012). https://doi.org/10.1007/s10772-012-9141-9

Received: 03 January 2012
Accepted: 14 March 2012
Published: 17 April 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s10772-012-9141-9
