Abstract
A natural audio-visual interface between a human user and a machine requires understanding the user’s audio-visual commands. This does not necessarily require full speech and image recognition. It does require, just as interaction with any working animal does, that the machine be capable of reacting to certain particular sounds and/or gestures while ignoring the rest. Toward this end, we are working on sound identification and classification approaches that ignore most of the acoustic input and react only to a particular sound (keyword).
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hermansky, H., Fousek, P., Lehtonen, M. (2005). The Role of Speech in Multimodal Human-Computer Interaction. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_2
DOI: https://doi.org/10.1007/11551874_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28789-6
Online ISBN: 978-3-540-31817-0
eBook Packages: Computer Science, Computer Science (R0)