A New Hierarchical Decision Structure Using Wavelet Packet and SVM for Brazilian Phonemes Recognition

de A. Bresolin, Adriano; Neto, Adrião Duarte D.; Alsina, Pablo Javier

doi:10.1007/11893257_18

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4233)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

In this work, a new phoneme recognition system is proposed. Its decisions are based on tongue position and lip rounding. The speech features are the coefficients of the Wavelet Packet Transform, with sub-bands selected according to the Mel scale. A Support Vector Machine (SVM) is used as the classifier within a Hierarchical Committee Machine structure. The database used for recognition was a set of oral vowel phonemes of the Portuguese language. The experimental results show success rates of 97.50% for the user-dependent case and 91.01% for the user-independent case. The proposed structure increased the success rate by 3.5% relative to the “one vs. all” decision strategy.
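The abstract does not say how the Mel scale guides the choice of wavelet packet sub-bands, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes the common analytic Mel approximation `2595 * log10(1 + f / 700)`, treats each wavelet packet split as halving a frequency band, and keeps splitting a band while it is wider than a chosen Mel width. The names `hz_to_mel` and `mel_guided_bands`, the 0 to 4000 Hz range, the 300 Mel threshold, and the depth limit of 6 are all hypothetical choices for illustration.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Common analytic approximation of the Mel scale (assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_guided_bands(lo, hi, max_mel_width, max_depth):
    """Recursively halve the band [lo, hi] until it is narrow enough in Mel.

    Halving a band stands in for one wavelet packet split into a low-pass
    and a high-pass child. Returns the leaf bands as (lo_hz, hi_hz) tuples,
    ordered from lowest to highest frequency.
    """
    too_wide = hz_to_mel(hi) - hz_to_mel(lo) > max_mel_width
    if max_depth == 0 or not too_wide:
        return [(lo, hi)]
    mid = (lo + hi) / 2.0
    return (mel_guided_bands(lo, mid, max_mel_width, max_depth - 1)
            + mel_guided_bands(mid, hi, max_mel_width, max_depth - 1))


# Hypothetical settings: 8 kHz sampling (0 to 4000 Hz), at most 6 split levels.
bands = mel_guided_bands(0.0, 4000.0, max_mel_width=300.0, max_depth=6)
for lo, hi in bands:
    print(f"{lo:7.1f} - {hi:7.1f} Hz")
```

Because the Mel scale is compressive, the resulting bands are narrow at low frequencies and wide at high frequencies. In an actual wavelet packet decomposition, the energy or coefficients of the nodes covering these bands would form the feature vector. Note that wavelet packet nodes are not generated in increasing frequency order, so a real implementation would need to map nodes to bands with care.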

Author information

Authors and Affiliations

UTFPR – Technological Federal University of Paraná, 271, Av. Brasil, 4232, Medianeira, PR, CEP 85.884-000, Brazil
Adriano de A. Bresolin
UFRN – Federal University of Rio Grande do Norte, 1524, Campus Universitário Lagoa Nova, CEP 59072-970, Natal, RN, Brazil
Adrião Duarte D. Neto & Pablo Javier Alsina

Editor information

Editors and Affiliations

Dept. of Computer Science and Engineering, The Chinese Univ. of Hong Kong, Shatin, N.T., Hong Kong
Irwin King
Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China
Jun Wang
The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Lai-Wan Chan
Department of Computer Science and Engineering & Center for Cognitive Science, The Ohio State University, Columbus, OH 43210
DeLiang Wang

About this paper

Cite this paper

de A. Bresolin, A., Neto, A.D.D., Alsina, P.J. (2006). A New Hierarchical Decision Structure Using Wavelet Packet and SVM for Brazilian Phonemes Recognition. In: King, I., Wang, J., Chan, LW., Wang, D. (eds) Neural Information Processing. ICONIP 2006. Lecture Notes in Computer Science, vol 4233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893257_18

DOI: https://doi.org/10.1007/11893257_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46481-5
Online ISBN: 978-3-540-46482-2
eBook Packages: Computer Science, Computer Science (R0)
