Event Detection by HMM, SVM and ANN: A Comparative Study

Lopes, Carla; Perdigão, Fernando

doi:10.1007/978-3-540-85980-2_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5190)

Included in the following conference series:

International Conference on Computational Processing of the Portuguese Language

Abstract

The goal of speech event detection (SED) is to reveal the presence of important elements in the speech signal for different sound classes. In a speech recognition system, events can be combined to detect phones, words or sentences, or to identify landmarks with which a decoder could be synchronized. In this paper, we apply three popular classification techniques (HMM, SVM and ANN), together with Non-Negative Matrix Deconvolution (NMD), to SED. The main purpose is to compare the performance of (1) HMM, (2) hybrid SVM/NMD, (3) hybrid SVM/HMM and (4) hybrid MLP/HMM approaches to SED, with emphasis on the approaches that reach lower Event Error Rates (EER). The hybrid SVM/HMM approach outperformed the HMM system, improving EER by 6%. The hybrid MLP/HMM approach achieved the best EER, with improvements of 11% and 8% over the HMM and hybrid SVM/HMM event detectors, respectively.
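The abstract does not define how the Event Error Rate is computed. As a minimal sketch, assuming EER is scored like a word error rate (substitutions, deletions and insertions from an edit-distance alignment, divided by the number of reference events), the metric could look as follows. The function names and the event labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum number of substitutions, deletions and insertions
    needed to turn the reference sequence into the hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # match or substitution
        prev = curr
    return prev[-1]


def event_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Assumed definition: 100 * (S + D + I) / number of reference events."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical event labels, for illustration only.
reference = ["silence", "fricative", "vowel", "stop", "vowel"]
hypothesis = ["silence", "vowel", "stop", "nasal", "vowel"]
rate = event_error_rate(reference, hypothesis)
print(f"EER = {rate:.1f}%")
```

Under this assumed definition, a detector that misses one event and inserts a spurious one in a five-event reference is charged two errors. Whether the improvements reported in the abstract are absolute or relative differences in EER is not stated there.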

Author information

Authors and Affiliations

Instituto de Telecomunicações: Carla Lopes, Fernando Perdigão
Instituto Politécnico de Leiria-ESTG: Carla Lopes
Universidade de Coimbra - DEEC, Pólo II, P-3030-290 Coimbra, Portugal: Fernando Perdigão

Editor information

António Teixeira, Vera Lúcia Strube de Lima, Luís Caldas de Oliveira, Paulo Quaresma

Cite this paper

Lopes, C., Perdigão, F. (2008). Event Detection by HMM, SVM and ANN: A Comparative Study. In: Teixeira, A., de Lima, V.L.S., de Oliveira, L.C., Quaresma, P. (eds) Computational Processing of the Portuguese Language. PROPOR 2008. Lecture Notes in Computer Science, vol 5190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85980-2_1

DOI: https://doi.org/10.1007/978-3-540-85980-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85979-6
Online ISBN: 978-3-540-85980-2
eBook Packages: Computer Science, Computer Science (R0)
