Abstract
This chapter describes key aspects of a visual perception system that serves as a core component for language game experiments on physical robots. The vision system is responsible for dividing the continuous flow of incoming visual stimuli into segments and computing a variety of features for each segment. This is done through a combination of bottom-up processing that works on the incoming signal and top-down processing based on expectations about what was seen before or on objects stored in memory. The chapter consists of two parts. The first is concerned with extracting and maintaining world models of spatial scenes, without any prior knowledge of the possible objects involved. The second deals with the recognition of gestures and actions that establish the joint attention and pragmatic feedback that are important aspects of language games.
Copyright information
© 2012 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Spranger, M., Loetzsch, M., Steels, L. (2012). A Perceptual System for Language Game Experiments. In: Steels, L., Hild, M. (eds) Language Grounding in Robots. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-3064-3_5
DOI: https://doi.org/10.1007/978-1-4614-3064-3_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-3063-6
Online ISBN: 978-1-4614-3064-3
eBook Packages: Computer Science (R0)