Abstract
Acquiring knowledge about persons is a key functionality for humanoid robots. In a natural environment, the robot interacts not only with people it already knows and recognizes, but also with unknown persons. By acquiring information about them, the robot can memorize these persons and provide extended, personalized services. Today, researchers build systems that recognize a person’s face, voice, and other features, but most of these systems depend on pre-collected data. We think that, with the technology now available, it is time to build a system that collects data autonomously and thus gets to know persons and learns to recognize them entirely on its own.
This paper describes the integration of different perceptual and dialog components, together with their individual functionality, to build a robot that can contact persons, learn their names, and learn to recognize them in future encounters.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Holzapfel, H., Schaaf, T., Ekenel, H.K., Schaa, C., Waibel, A. (2007). A Robot Learns to Know People—First Contacts of a Robot. In: Freksa, C., Kohlhase, M., Schill, K. (eds) KI 2006: Advances in Artificial Intelligence. KI 2006. Lecture Notes in Computer Science, vol 4314. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69912-5_23
DOI: https://doi.org/10.1007/978-3-540-69912-5_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69911-8
Online ISBN: 978-3-540-69912-5
eBook Packages: Computer Science, Computer Science (R0)