Automatic Phoneme Border Detection to Improve Speech Recognition

  • Conference paper
Advances in Artificial Intelligence and Soft Computing (MICAI 2015)

Abstract

A comparative study of speech recognition performance between systems trained with manually labeled corpora and systems trained with semiautomatically labeled corpora is presented. An automatic labeling system was designed to generate phoneme label files for all words within the corpus used to train an automatic speech recognition system. Speech recognition experiments were performed using the same corpus, training first with manually generated labels and then with automatically generated labels. Results show that recognition performance on the selected dictionary is better when training uses automatic label files than when it uses manual label files. Automatic labeling of speech corpora is not only faster than manual labeling but also free from the subjectivity inherent in the manual segmentation performed by specialists. The recognition performance achieved in this work is greater than 96%.

Acknowledgments

We thank the National Polytechnic Institute (IPN, Instituto Politécnico Nacional, México), COFAA-IPN, PIFI-IPN, SIP-IPN 20141454 and SIP-IPN 20130617 for their academic and financial support.

Author information

Corresponding author

Correspondence to Suárez-Guerra Sergio.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sergio, SG., Cristian-Remington, JM., Luis, OR.J. (2015). Automatic Phoneme Border Detection to Improve Speech Recognition. In: Sidorov, G., Galicia-Haro, S. (eds) Advances in Artificial Intelligence and Soft Computing. MICAI 2015. Lecture Notes in Computer Science, vol 9413. Springer, Cham. https://doi.org/10.1007/978-3-319-27060-9_11

  • DOI: https://doi.org/10.1007/978-3-319-27060-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27059-3

  • Online ISBN: 978-3-319-27060-9

  • eBook Packages: Computer Science, Computer Science (R0)
