Automatic Phoneme Border Detection to Improve Speech Recognition

Suárez-Guerra Sergio, Juárez-Murillo Cristian-Remington & Oropeza-Rodríguez José Luis

doi:10.1007/978-3-319-27060-9_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9413)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

This paper presents a comparative study of speech recognition performance between systems trained with manually labeled corpora and systems trained with semiautomatically labeled corpora. An automatic labeling system was designed to generate phoneme label files for all words in the corpus used to train an automatic speech recognition system. Speech recognition experiments were performed on the same corpus, training first with manually generated labels and then with automatically generated labels. The results show that, for the selected dictionary, recognition performance is better when training uses automatic label files than when it uses manual label files. Automatic labeling of speech corpora is not only faster than manual labeling but also free from the subjectivity inherent in manual segmentation performed by specialists. The recognition performance achieved in this work is greater than 96%.

Acknowledgments

We thank the National Polytechnic Institute (IPN, Instituto Politécnico Nacional, México), COFAA-IPN, PIFI-IPN, SIP-IPN 20141454, and SIP-IPN 20130617 for their academic and financial support.

Author information

Authors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico City, Mexico
Suárez-Guerra Sergio, Juárez-Murillo Cristian-Remington & Oropeza-Rodríguez José Luis

Corresponding author

Correspondence to Suárez-Guerra Sergio.

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Centro de Investigación en Computación, Mexico City, Mexico
Grigori Sidorov
Facultad de ciencias, Universidad Autónoma Nacional, México, Distrito Federal, Mexico
Sofía N. Galicia-Haro

About this paper

Cite this paper

Sergio, SG., Cristian-Remington, JM., Luis, OR.J. (2015). Automatic Phoneme Border Detection to Improve Speech Recognition. In: Sidorov, G., Galicia-Haro, S. (eds) Advances in Artificial Intelligence and Soft Computing. MICAI 2015. Lecture Notes in Computer Science, vol 9413. Springer, Cham. https://doi.org/10.1007/978-3-319-27060-9_11

DOI: https://doi.org/10.1007/978-3-319-27060-9_11
Published: 30 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27059-3
Online ISBN: 978-3-319-27060-9
eBook Packages: Computer Science, Computer Science (R0)