Voice/Non-Voice Classification Using Reliable Fundamental Frequency Estimator for Voice Activated Powered Wheelchair Control

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 4523)

Abstract

In this paper, we introduce a non-voice rejection method that performs Voice/Non-Voice (V/NV) classification using a fundamental frequency (F0) estimator called YIN. Although current speech recognition technology achieves high performance, it remains insufficient for applications that demand high reliability, such as voice control of powered wheelchairs for disabled persons. A V/NV classification algorithm that rejects non-voice input in Voice Activity Detection (VAD) helps realize a highly reliable system. The proposed V/NV classification uses the ratio of the reliable F0 contour to the whole input interval. To evaluate the proposed method, we used 1567 voice commands and 447 noises from powered wheelchair control in a real environment. The results indicate that the recall rate is 97% when the lowest threshold is selected for noise classification with 99% precision in VAD.
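The ratio feature described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give frame sizes, thresholds, or the reliability criterion, so every parameter and function name below is a hypothetical choice. The sketch assumes a frame counts as having a "reliable" F0 when the minimum of a simplified YIN-style cumulative mean normalized difference falls below a threshold, and it omits the other steps of the full YIN estimator.

```python
def cmnd_min(frame, tau_min, tau_max):
    """Minimum of a YIN-style cumulative mean normalized difference
    over lags tau_min..tau_max (simplified; assumed reliability cue)."""
    n = len(frame) - tau_max          # samples compared at each lag
    running_sum = 0.0                 # sum of d(1..tau)
    best = 1.0
    for tau in range(1, tau_max + 1):
        # Squared difference between the frame and its shifted copy.
        d = sum((frame[j] - frame[j + tau]) ** 2 for j in range(n))
        running_sum += d
        # d'(tau) = d(tau) / mean(d(1..tau)); treat silence as aperiodic.
        cmnd = d * tau / running_sum if running_sum > 0 else 1.0
        if tau >= tau_min:
            best = min(best, cmnd)
    return best


def reliable_f0_ratio(signal, frame_len=400, hop=200,
                      tau_min=20, tau_max=200, aperiodicity_max=0.2):
    """Fraction of frames in the input interval with a reliable F0.
    All default values are illustrative assumptions."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    if len(starts) == 0:
        return 0.0
    reliable = sum(
        1 for s in starts
        if cmnd_min(signal[s:s + frame_len], tau_min, tau_max) < aperiodicity_max
    )
    return reliable / len(starts)


def is_voice(signal, ratio_threshold=0.5, **kwargs):
    """Accept the interval as voice when enough frames are periodic.
    The threshold value is hypothetical, not taken from the paper."""
    return reliable_f0_ratio(signal, **kwargs) >= ratio_threshold
```

With these assumed settings, a steady periodic signal yields a ratio near 1 while broadband noise yields a ratio near 0, so raising or lowering `ratio_threshold` shifts the balance between rejecting noise and keeping genuine voice commands.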

Editor information

Yann-Hang Lee Heung-Nam Kim Jong Kim Yongwan Park Laurence T. Yang Sung Won Kim

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Suk, SY., Chung, HY., Kojima, H. (2007). Voice/Non-Voice Classification Using Reliable Fundamental Frequency Estimator for Voice Activated Powered Wheelchair Control. In: Lee, YH., Kim, HN., Kim, J., Park, Y., Yang, L.T., Kim, S.W. (eds) Embedded Software and Systems. ICESS 2007. Lecture Notes in Computer Science, vol 4523. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72685-2_33

  • DOI: https://doi.org/10.1007/978-3-540-72685-2_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72684-5

  • Online ISBN: 978-3-540-72685-2

  • eBook Packages: Computer Science, Computer Science (R0)
