HMM-Based Acoustic Event Detection with AdaBoost Feature Selection

  • Conference paper
Multimodal Technologies for Perception of Humans (RT 2007, CLEAR 2007)

Abstract

Because of the spectral difference between speech and acoustic events, we propose using the Kullback-Leibler distance to quantify the discriminant capability of all speech feature components in acoustic event detection. Based on these distances, we use AdaBoost to select a discriminant feature set and demonstrate that this feature set outperforms classical speech feature sets such as MFCC in one-pass HMM-based acoustic event detection. We implement an HMM-based acoustic event detection system with lattice rescoring using a feature set selected by the above AdaBoost-based approach.
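As an illustration of the first step described in the abstract, the sketch below scores each feature component by a Kullback-Leibler distance between two classes and ranks the components. It is not the authors' implementation: the abstract does not say how the distributions are modelled or which classes are compared, so this sketch assumes a univariate Gaussian per component, a symmetrised KL distance, and two synthetic classes named `speech` and `events`. All function names are hypothetical. The subsequent AdaBoost selection step is not shown.

```python
import math
import random


def gaussian_fit(values):
    """Return the sample mean and variance of one feature component."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, max(var, 1e-12)  # floor the variance to avoid division by zero


def kl_gaussian(mean_p, var_p, mean_q, var_q):
    """KL(p || q) in nats for two univariate Gaussians (modelling assumption)."""
    return 0.5 * (
        math.log(var_q / var_p)
        + (var_p + (mean_p - mean_q) ** 2) / var_q
        - 1.0
    )


def symmetric_kl_per_component(class_a, class_b):
    """Symmetrised KL distance for each feature component.

    class_a, class_b: lists of equal-length feature vectors, one per frame.
    """
    dims = len(class_a[0])
    distances = []
    for d in range(dims):
        mean_a, var_a = gaussian_fit([x[d] for x in class_a])
        mean_b, var_b = gaussian_fit([x[d] for x in class_b])
        distances.append(
            kl_gaussian(mean_a, var_a, mean_b, var_b)
            + kl_gaussian(mean_b, var_b, mean_a, var_a)
        )
    return distances


def rank_components(distances):
    """Indices of feature components, most discriminative first."""
    return sorted(range(len(distances)), key=lambda d: distances[d], reverse=True)


# Synthetic data (assumption): component 0 separates the classes, component 1 does not.
rng = random.Random(0)
speech = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(2000)]
events = [[rng.gauss(3.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(2000)]

distances = symmetric_kl_per_component(speech, events)
ranking = rank_components(distances)
print(ranking, [round(d, 3) for d in distances])
```

A per-component score of this kind only measures how well each component separates the classes on its own; it says nothing about redundancy between components, which is one reason a selection procedure such as AdaBoost might be applied on top of it.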



Editor information

Rainer Stiefelhagen, Rachel Bowers, Jonathan Fiscus

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhou, X., Zhuang, X., Liu, M., Tang, H., Hasegawa-Johnson, M., Huang, T. (2008). HMM-Based Acoustic Event Detection with AdaBoost Feature Selection. In: Stiefelhagen, R., Bowers, R., Fiscus, J. (eds) Multimodal Technologies for Perception of Humans. RT 2007, CLEAR 2007. Lecture Notes in Computer Science, vol 4625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68585-2_33

  • DOI: https://doi.org/10.1007/978-3-540-68585-2_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68584-5

  • Online ISBN: 978-3-540-68585-2

  • eBook Packages: Computer Science, Computer Science (R0)
