
Event Detection by HMM, SVM and ANN: A Comparative Study

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5190)

Abstract

The goal of speech event detection (SED) is to reveal the presence of important elements in the speech signal for different sound classes. In a speech recognition system, events can be combined to detect phones, words or sentences, or to identify landmarks with which a decoder could be synchronized. In this paper, we apply three popular classification techniques, HMM, SVM and ANN, together with Non-Negative Matrix Deconvolution (NMD), to SED. The main purpose of this paper is to compare the performance of (1) HMM, (2) hybrid SVM/NMD, (3) hybrid SVM/HMM and (4) hybrid MLP/HMM approaches to SED and to emphasize approaches that reach lower Event Error Rates (EER). The hybrid SVM/HMM approach outperformed the HMM system, improving the EER by 6%. The hybrid MLP/HMM approach achieved the best EER, with improvements of 11% and 8% over the HMM and hybrid SVM/HMM event detectors, respectively.



Editor information

António Teixeira, Vera Lúcia Strube de Lima, Luís Caldas de Oliveira, Paulo Quaresma

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lopes, C., Perdigão, F. (2008). Event Detection by HMM, SVM and ANN: A Comparative Study. In: Teixeira, A., de Lima, V.L.S., de Oliveira, L.C., Quaresma, P. (eds) Computational Processing of the Portuguese Language. PROPOR 2008. Lecture Notes in Computer Science, vol 5190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85980-2_1

  • DOI: https://doi.org/10.1007/978-3-540-85980-2_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85979-6

  • Online ISBN: 978-3-540-85980-2

  • eBook Packages: Computer Science, Computer Science (R0)
