Abstract
MOBSY is a fully integrated, autonomous mobile service robot system. It acts as an automatic, dialogue-based receptionist for visitors to our institute. MOBSY combines many techniques from different research areas into one working stand-alone system. From a pattern recognition point of view, the computer vision and dialogue aspects are of primary interest. In brief, the techniques involved range from object classification through visual self-localization and recalibration to object tracking with multiple cameras. The dialogue component has to handle speech recognition, understanding, and answer generation. Further techniques needed are navigation, obstacle avoidance, and mechanisms that provide fault-tolerant behavior. This contribution introduces MOBSY. Besides the main aspects of vision and speech, we also focus on integration, at both the methodological and the technical level. We describe the task and the techniques involved. Finally, we discuss the experience we gained with MOBSY during a live performance at the 25th anniversary of our institute.
This work was supported by the “Deutsche Forschungsgemeinschaft” under grant SFB603/TP B2, C2 and by the “Bayerische Forschungsstiftung” under grant DIROKOL.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zobel, M. et al. (2001). MOBSY: Integration of Vision and Dialogue in Service Robots. In: Schiele, B., Sagerer, G. (eds) Computer Vision Systems. ICVS 2001. Lecture Notes in Computer Science, vol 2095. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48222-9_4
DOI: https://doi.org/10.1007/3-540-48222-9_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42285-3
Online ISBN: 978-3-540-48222-2