Voice/Non-Voice Classification Using Reliable Fundamental Frequency Estimator for Voice Activated Powered Wheelchair Control

Suk, Soo-Young; Chung, Hyun-Yeol; Kojima, Hiroaki

doi:10.1007/978-3-540-72685-2_33

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 4523)

Abstract

In this paper, we introduce a non-voice rejection method that performs Voice/Non-Voice (V/NV) classification using YIN, a fundamental frequency (F₀) estimator. Although current speech recognition technology achieves high performance, it remains insufficient for applications that require high reliability, such as voice control of powered wheelchairs for disabled persons. A V/NV classification algorithm that rejects non-voice input in Voice Activity Detection (VAD) helps realize a highly reliable system. The proposed V/NV classification uses the ratio of the reliable F₀ contour to the whole input interval. To evaluate the proposed method, we used 1567 voice commands and 447 noises from powered wheelchair control in a real environment. The results indicate a recall rate of 97% when the lowest threshold is selected for noise classification, with 99% precision in VAD.
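
The ratio-based decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a frame's F₀ is judged "reliable", so the sketch assumes a frame counts as reliable when its YIN-style aperiodicity value is low and its F₀ lies in a plausible speech range. The function names, the aperiodicity limit, the F₀ range, and the ratio threshold are all hypothetical values chosen for illustration.

```python
from typing import Sequence


def reliable_f0_ratio(
    f0_hz: Sequence[float],
    aperiodicity: Sequence[float],
    max_aperiodicity: float = 0.2,  # assumed reliability limit, not from the paper
    f0_min: float = 60.0,           # assumed lower bound of speech F0 in Hz
    f0_max: float = 400.0,          # assumed upper bound of speech F0 in Hz
) -> float:
    """Return the fraction of frames in the interval whose F0 looks reliable."""
    if len(f0_hz) != len(aperiodicity):
        raise ValueError("f0_hz and aperiodicity must have the same length")
    if len(f0_hz) == 0:
        return 0.0
    # Count frames that are both periodic enough and inside the F0 range.
    reliable = sum(
        1
        for f0, ap in zip(f0_hz, aperiodicity)
        if ap <= max_aperiodicity and f0_min <= f0 <= f0_max
    )
    return reliable / len(f0_hz)


def is_voice(
    f0_hz: Sequence[float],
    aperiodicity: Sequence[float],
    ratio_threshold: float = 0.3,  # assumed decision threshold, not from the paper
) -> bool:
    """Accept the interval as voice when enough of it has a reliable F0."""
    return reliable_f0_ratio(f0_hz, aperiodicity) >= ratio_threshold


# Toy per-frame values for one interval passed on by VAD (made up for illustration).
speech_f0 = [120.0, 125.0, 0.0, 130.0]
speech_ap = [0.05, 0.10, 0.90, 0.50]
noise_f0 = [0.0, 900.0, 50.0, 0.0]
noise_ap = [0.90, 0.80, 0.10, 0.70]

print(is_voice(speech_f0, speech_ap))  # interval with two reliable frames out of four
print(is_voice(noise_f0, noise_ap))    # interval with no reliable frames
```

Lowering the ratio threshold accepts more intervals as voice, which favors recall of voice commands at the cost of letting more noise through; this is the trade-off the reported recall figure refers to.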

Author information

Authors and Affiliations

Information Technology Research Institute, National Institute of Advanced Industrial Science and Technology, AIST Tsukuba Central 2, 1-1-1 Umezono, Tsukuba, Ibaraki, 305-8568, Japan
Soo-Young Suk & Hiroaki Kojima
School of Electrical Engineering and Computer Science, Yeungnam University 214-1, Daedong, Gyungsan, Gyungbuk, 712-749, Korea
Hyun-Yeol Chung

Editor information

Yann-Hang Lee, Heung-Nam Kim, Jong Kim, Yongwan Park, Laurence T. Yang, Sung Won Kim

Cite this paper

Suk, SY., Chung, HY., Kojima, H. (2007). Voice/Non-Voice Classification Using Reliable Fundamental Frequency Estimator for Voice Activated Powered Wheelchair Control. In: Lee, YH., Kim, HN., Kim, J., Park, Y., Yang, L.T., Kim, S.W. (eds) Embedded Software and Systems. ICESS 2007. Lecture Notes in Computer Science, vol 4523. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72685-2_33

DOI: https://doi.org/10.1007/978-3-540-72685-2_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72684-5
Online ISBN: 978-3-540-72685-2
eBook Packages: Computer Science, Computer Science (R0)
