Duration Model-Based Post-processing for the Performance Improvement of a Keyword Spotting System

Lee, Min Ji; Yoon, Jae Sam; Oh, Yoo Rhee; Kim, Hong Kook; Choi, Song Ha; Kim, Ji Woon; Kim, Myeong Bo

doi:10.1007/978-3-642-17604-3_16

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 120))

Included in the following conference series:

International Conference on Future Generation Communication and Networking

Abstract

In this paper, we propose a post-processing method based on a duration model to improve the performance of a keyword spotting system. The method is applied after a keyword has been detected. To detect the keyword, we first combine a keyword model, a non-keyword model, and a silence model. Using the information on the detected keyword, the proposed post-processing method then determines whether or not the correct keyword was detected. To this end, we generate the duration model using a Gaussian distribution in order to accommodate the different duration characteristics of each phoneme. A comparison with conventional anti-keyword scoring methods shows that the proposed method reduces the false acceptance and the false rejection rates.
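The idea of a per-phoneme Gaussian duration model can be illustrated with a minimal sketch. The abstract does not give the scoring rule, the training data, or the decision threshold, so everything below is an assumption made for illustration: the function names, the phoneme labels, the frame counts, the average log-likelihood score, and the threshold value are all hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pstdev


def fit_duration_model(training_durations):
    """Fit one Gaussian (mean, std) per phoneme from durations given in frames."""
    model = {}
    for phoneme, durations in training_durations.items():
        mu = mean(durations)
        sigma = max(pstdev(durations), 1e-3)  # floor guards against zero variance
        model[phoneme] = (mu, sigma)
    return model


def log_gaussian(x, mu, sigma):
    """Log-density of a univariate Gaussian at x."""
    var = sigma ** 2
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def duration_score(model, segments):
    """Average per-phoneme log-likelihood of the detected keyword's durations.

    Assumed scoring rule; the paper may combine durations differently.
    """
    total = sum(log_gaussian(d, *model[p]) for p, d in segments)
    return total / len(segments)


def accept_keyword(model, segments, threshold):
    """Keep the detection only if its durations are plausible under the model."""
    return duration_score(model, segments) >= threshold


# Hypothetical training durations (in frames) for three made-up phoneme labels.
training = {"k": [6, 7, 8], "a": [10, 12, 14], "t": [5, 6, 7]}
model = fit_duration_model(training)

# Phoneme segmentations of two hypothetical detections of the same keyword.
plausible = [("k", 7), ("a", 12), ("t", 6)]
implausible = [("k", 20), ("a", 2), ("t", 25)]

THRESHOLD = -5.0  # illustrative value, not from the paper
plausible_ok = accept_keyword(model, plausible, THRESHOLD)
implausible_ok = accept_keyword(model, implausible, THRESHOLD)
```

In this sketch a detection whose phoneme durations lie far from the trained means receives a very low score and is rejected, which is one way a duration model could filter false acceptances after the initial detection step.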

Author information

Authors and Affiliations

School of Information and Communications, Gwangju Institute of Science and Technology (GIST), Gwangju, 500-712, Korea
Min Ji Lee, Jae Sam Yoon, Yoo Rhee Oh & Hong Kook Kim
Camcorder Business Team, Digital Media Business, Samsung Electronics, Suwon-si, Gyenggi-do, 443-742, Korea
Song Ha Choi, Ji Woon Kim & Myeong Bo Kim

Editor information

Editors and Affiliations

Hannam University, Daejeon, South Korea
Tai-hoon Kim
University of Western Macedonia, Kozani, Greece
Thanos Vasilakos
Faculty of Information Science and Electrical Engineering, Kyushu University, 6-10-1 Hakozaki, 812-8581, Fukuoka, Japan
Kouichi Sakurai
The University of Alabama, Tuscaloosa, AL, USA
Yang Xiao
Sun Yat-sen University, 510275, Guangzhou, P.R. China
Gansen Zhao
University of Warsaw & Infobright Inc., Poland
Dominik Ślęzak

About this paper

Cite this paper

Lee, M.J. et al. (2010). Duration Model-Based Post-processing for the Performance Improvement of a Keyword Spotting System. In: Kim, Th., Vasilakos, T., Sakurai, K., Xiao, Y., Zhao, G., Ślęzak, D. (eds) Communication and Networking. FGCN 2010. Communications in Computer and Information Science, vol 120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17604-3_16

DOI: https://doi.org/10.1007/978-3-642-17604-3_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17603-6
Online ISBN: 978-3-642-17604-3
eBook Packages: Computer Science, Computer Science (R0)
