Abstract
The core cognitive ability to perceive and synthesize ‘shapes’ underlies all our basic interactions with the world, be it shaping one’s fingers to grasp a ball or shaping one’s body while imitating a dance. In this article, we describe our attempts to understand this multifaceted problem by creating a primitive shape perception/synthesis system for the baby humanoid iCub. We specifically deal with the scenario of iCub gradually learning to draw or scribble shapes of gradually increasing complexity, after observing a demonstration by a teacher, by using a series of self evaluations of its performance. Learning to imitate a demonstrated human movement (specifically, visually observed end-effector trajectories of a teacher) can be considered as a special case of the proposed computational machinery. This architecture is based on a loop of transformations that express the embodiment of the mechanism but, at the same time, are characterized by scale invariance and motor equivalence. The following transformations are integrated in the loop: (a) Characterizing in a compact, abstract way the ‘shape’ of a demonstrated trajectory using a finite set of critical points, derived using catastrophe theory: Abstract Visual Program (AVP); (b) Transforming the AVP into a Concrete Motor Goal (CMG) in iCub’s egocentric space; (c) Learning to synthesize a continuous virtual trajectory similar to the demonstrated shape using the discrete set of critical points defined in CMG; (d) Using the virtual trajectory as an attractor for iCub’s internal body model, implemented by the Passive Motion Paradigm which includes a forward and an inverse motor model; (e) Forming an Abstract Motor Program (AMP) by deriving the ‘shape’ of the self generated movement (forward model output) using the same technique employed for creating the AVP; (f) Comparing the AVP and AMP in order to generate an internal performance score and hence closing the learning loop. The resulting computational framework further combines three crucial streams of learning: (1) motor babbling (self exploration), (2) imitative action learning (social interaction) and (3) mental simulation, to give rise to sensorimotor knowledge that is endowed with seamless compositionality, generalization capability and body-effectors/task independence. The robustness of the computational architecture is demonstrated by means of several experimental trials of gradually increasing complexity using a state of the art humanoid platform.
Similar content being viewed by others
References
Amedi, A., Stern, W., Camprodon, A. J., Bermpohl, F., Merabet, L., Rotman, S., Hemond, C., Meijer, P., & Pascual-Leone, A. (2007). Shape conveyed by visual-to-auditory sensory substitution activates the lateral occipital complex. Nature Neuroscience, 10(6), 687–689.
Anquetil, E., & Lorette, G. (1997). Perceptual model of handwriting drawing: application to the handwriting segmentation problem. In Proceedings of the fourth international conference on document analysis and recognition (pp. 112–117).
Aparna, K. H., Subramanian, V., Kasirajan, M., Prakash, G. V., Chakravarthy, V. S., & Madhvanath, S. (2004). Online handwriting recognition for tamil. In Proceedings of ninth international workshop on frontiers in handwriting recognition.
Argall, B. D., Chernova, S., Veloso, M., & Browning, B. (2009). A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5), 469–483.
Atkeson, C. G., & Schaal, S. (1997a). Learning tasks from a single demonstration. Proceedings of the IEEE International Conference on Robotics and Automation, 2, 1706–1712.
Atkeson, C. G., & Schaal, S. (1997b). Robot learning from demonstration. In Proceedings of the fourteenth international conference on machine learning (pp. 12–20).
Basteris, A., Bracco, L., & Sanguineti, V. (2010). Intermanual transfer of handwriting skills: role of visual and haptic assistance. In Proceedings of the 4th IMEKO TC 18 symposium: measurement, analysis and modelling of human functions.
Belkasim, S., Shridhar, M., & Ahmadi, M. (1991). Pattern recognition with moment invariants: a comparative study and new results. Pattern Recognition, 24, 1117–1138.
Bentivegna, D. C., Ude, A., Atkeson, C. G., & Cheng, G. (2002). Humanoid robot learning and game playing using PC-based vision. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems.
Billard, A., & Mataric, M. (2001). Learning human arm movements by imitation: evaluation of a biologically- inspired architecture. Robotics and Autonomous Systems, 941, 1–16.
Bizzi, E., Polit, A., & Morasso, P. (1976). Mechanisms underlying achievement of final position. Journal of Neurophysiology, 39, 435–444.
Blum, H. (1967). A transformation for extracting new descriptors of shape. In A. Whaten-Dunn (Ed.), Models for the perception of speech and visual forms (pp. 362–380). Cambridge: MIT Press.
Boronat, C., Buxbaum, L., Coslett, H., Tang, K., Saffran, E., Kimberg, D., & Detre, J. (2005). Distinction between manipulation and function knowledge of objects: evidence from functional magnetic resonance imaging. Cognitive Brain Research, 23, 361–373.
Braun, D. A., Mehring, C., & Wolpert, D. M. (2010). Structure learning in action. Behavioural Brain Research, 206, 157–165.
Brown, H. D. (1987). Principles of language learning and teaching. New York: Prentice-Hall.
Bullock, D., & Grossberg, S. (1988). Neural dynamics of planned arm movements: emergent invariants and speed-accuracy properties. Psychological Reviews, 95, 49–90.
Casadio, M., Morasso, P., Sanguineti, V., & Arrichiello, V. (2006). Braccio di Ferro: a new haptic workstation for neuromotor rehabilitation. Technology and Health Care, 14, 123–142.
Cattaneo, L., & Rizzolatti, G. (2009). The mirror neuron system. Archives of Neurology, 66(5), 557–560.
Chakravarthy, V. S., & Kompella, B. (2003). The shape of handwritten characters. Pattern Recognition Letters, 24, 1901–1913.
Chella, A., Dindo, H., & Infantino, I. (2006). A cognitive framework for imitation learning. Robotics and Autonomous Systems, 54(5), 403–408. Special issue: the social mechanisms of robot programming by demonstration.
Chen, S., Keller, J., & Crownover, R. (1990). Shape from fractal geometry. Artificial Intelligence, 43, 199–218.
Clark, J. J. (1988). Singularity theory and phantom edges in scale-space. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(5), 720–727.
Dautenhahn, K., & Nehaniv, C. L. (2002). Imitation in animals and artifacts. London: MIT Press. ISBN:0262042037.
Demiris, Y., & Simmons, G. (2006a). Perceiving the unusual: temporal properties of hierarchical motor representations for action perception. Neural Networks, 19(3), 272–284.
Demiris, Y., & Khadhouri, B. (2006b). Hierarchical Attentive Multiple Models for Execution and Recognition (HAMMER). Robotics and Autonomous Systems, 54, 361–369.
Duda, R. O., & Hart, P. E. (1973). Pattern classification and scene analysis. New York: Wiley.
Duncan, C. P. (1960). Description of learning to learn in human subjects. The American Journal of Psychology, 73(1), 108–114.
Ellis, R., & Tucker, M. (2000). Micro-affordance: the potentiation of components of action by seen objects. British Journal of Psychology, 91(4), 451–471.
Feldman, A. G. (1966). Functional tuning of the nervous system with control of movement or maintenance of a steady posture, II: controllable parameters of the muscles. Biophysics, 11, 565–578.
Fischler, M. A., & Wolf, H. C. (1994). Locating perceptually salient points on planar curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(2), 113–129.
Gaglio, S., Grattarola, A., Massone, L., & Morasso, P. (1987). Structure and texture in shape representation. Journal of Intelligent Systems, 1(1), 1–41.
Gallese, V., & Lakoff, G. (2005). The Brain’s concepts: the role of the sensory-motor system in reason and language. Cognitive Neuropsychology, 22, 455–479.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Gilmore, R. (1981). Catastrophe theory for scientists and engineers. New York: Wiley-Interscience.
Grafton, S. T., Arbib, M. A., Fadiga, L., & Rizzolatti, G. (1996). Localization of grasp representation in humans by positron emission tomography: 2 observation compared with imagination. Experimental Brain Research, 112, 103–111.
Halford, G. S., Wilson, W. H., & Phillips, S. (1998). Processing capacity defined by relational complexity: implications for comparative, developmental, and cognitive psychology. Behavioral and Brain Sciences, 21, 723–802.
Harlow, H. F. (1949). The formation of learning sets. Psychological Review, 56, 51–65.
Hebb, D. O. (1949). The organization of behavior: a neuropsychological theory. New York: Wiley.
Hersch, M., & Billard, A. G. (2008). Reaching with multi-referential dynamical systems. Autonomous Robots, 25, 71–83.
Hoff, W., & Ahuja, N. (1989). Surfaces from stereo: integrating feature matching, disparity estimation, and contour detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 121–136.
Hoffmann, H., Pastor, P., Asfour, T., & Schaal, S. (2009). Learning and generalization of motor skills by learning from demonstration. In Proceedings of the international conference on robotics and automation.
Horn, B. K. P. (1990). Height and gradient from shading. International Journal of Computer Vision, 5, 37–75.
Iacoboni, M., Koski, L. M., Brass, M., Bekkering, H., Woods, R. P., Dubeau, M. C., Mazziotta, J. C., & Rizzolatti, G. (2001). Reafferent copies of imitated actions in the right superior temporal cortex. Proceedings of the National Academy of Sciences of the United States of America, 98, 13995–13999.
Iacoboni, M. (2009). Imitation, empathy, and mirror neurons. Annual Review of Psychology.
Ijspeert, J. A., Nakanishi, J., & Schaal, S. (2002). Movement imitation with nonlinear dynamical systems in humanoid robots. In Proceedings of the IEEE international conference on robotics and automation.
Iyer, N., Jayanti, S., Lou, K., Kalyanaraman, Y., & Ramani, K. (2005). Three-dimensional shape searching: state-of-the-art review and future trends. Computer Aided Design, 37, 509–530.
Jagadish, H. V., & Bruckstein, A. M. (1992). On sequential shape descriptions. Pattern Recognition, 25, 165–172.
Koenderink, J. J., & van Doorn, A. J. (1986). Dynamic shape. Biological Cybernetics, 53, 383–396.
Koski, L., Wohlschlager, A., Bekkering, H., Woods, R. P., Dubeau, M. C., Mazziotta, J. C., & Iacoboni, M. (2002). Modulation of motor and premotor activity during imitation of target-directed actions. Cerebral Cortex, 12, 847–855.
Li, X., & Yeung, D. Y. (1997). On-line alphanumeric character recognition using dominant points in strokes. Pattern Recognition, 30(1), 31–44.
Loncaric, S. (1998). A survey of shape analysis techniques. Pattern Recognition, 31(8), 983–1001.
Lopes, M., Melo, F., Montesano, L., & Santos-Victor, J. (2010). Abstraction levels for robotic imitation: overview and computational approaches. In O. Sigaud & J. Peters (Eds.), Series: studies in computational intelligence. From motor learning to interaction learning in robots. Berlin: Springer.
Madduri, K., Aparna, H. K., & Chakravarthy, V. S. (2004). PATRAM—A handwritten word processor for Indian languages. In Proceedings of ninth international workshop on frontiers in handwriting recognition.
Manikandan, B. J., Shankar, G., Anoop, V., Datta, A., & Chakravarthy, V. S. (2002). LEKHAK: a system for online recognition of handwritten tamil characters. In Proceedings of the international conference on natural language processing.
Marr, D. (1982). Vision: a computational investigation into the human representation and processing of visual information. New York: Freeman.
Mehrotra, R., Nichani, S., & Ranganathan, N. (1990). Corner detection. Pattern Recognition, 23(11), 1223–1233.
Metta, G., Fitzpatrick, P., & Natale, L. (2006). YARP: yet another robot platform. International Journal on Advanced Robotics Systems, 3(1), 43–48. Special issue on Software Development and Integration in Robotics.
Mohan, V., & Morasso, P. (2007). Towards reasoning and coordinating action in the mental space. International Journal of Neural Systems, 17(4), 1–13.
Mohan, V., & Morasso, P. (2008). Reaching extended’: unified computational substrate for mental simulation and action execution in cognitive robots. In Proceedings of third international conference of cognitive science.
Mohan, V., Morasso, P., Metta, G., & Sandini, G. (2009a). A biomimetic, force-field based computational model for motion planning and bimanual coordination in humanoid robots. Autonomous Robots, 27(3), 291–301.
Mohan, V., Zenzeri, J., Morasso, P., & Metta, G. (2009b). Composing and coordinating body models of arbitrary complexity and redundancy: a biomimetic field computing approach. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems.
Morasso, P., Mussa Ivaldi, F. A., & Ruggiero, C. (1983). How a discontinuous mechanism can produce continuous patterns in trajectory formation and handwriting. Acta Psychologica, 54, 83–98.
Morasso, P., Casadio, M., Mohan, V., & Zenzeri, J. (2010). A neural mechanism of synergy formation for whole body reaching. Biological Cybernetics, 102(1), 45–55.
Mussa Ivaldi, F. A., Morasso, P., & Zaccaria, R. (1988). Kinematic networks. A distributed model for representing and regularizing motor redundancy. Biological Cybernetics, 60, 1–16.
Perrett, D. I., & Emery, N. J. (1994). Understanding the intentions of others from visual signals: neurophysiological evidence. Current Psychology of Cognition, 13, 683–694.
Poston, T., & Stewart, I. N. (1998). Catastrophe theory and its applications. London: Pitman.
Ramachandran, V. S., & Hubbard, E. M. (2003). Hearing colors, tasting shapes. Scientific American, 288(5), 42–49.
Rizzolatti, G., & Arbib, M. A. (1998). Language within our grasp. Trends in Neurosciences, 21, 188–194.
Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms underlying action understanding and imitation. Nature Reviews. Neuroscience, 2, 661–670.
Rizzolatti, G., Fadiga, L., Matelli, M., Bettinardi, V., Paulesu, E., Perani, D., & Fazio, F. (1996). Localization of grasp representations in humans by PET: 1. Observation versus execution. Experimental Brain Research, 111, 246–252.
Rocha, J., & Pavlidis, T. (1994). A shape analysis model with application to a character recognition system. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(4), 393–404.
Sandini, G., Metta, G., & Vernon, D. (2004). RobotCub: an open framework for research in embodied cognition. In Proceedings of the 4th IEEE/RAS international conference on humanoid robots (pp. 13–32).
Sanfeliu, A., & Fu, K. (1983). A distance measure between attributed relational graphs for pattern recognition. IEEE Transactions on Systems, Man, and Cybernetics, 13(3), 353–362.
Schaal, S. (1999). Is imitation learning the route to humanoid robots? Trends in Cognitive Sciences, 3, 233–242.
Schaal, S., Ijspeert, A., & Billard, A. (2003). Computational approaches to motor learning by imitation. Philosophical Transaction of the Royal Society of London B, 358, 537–547.
Shankar, G., Anoop, V., & Chakravarthy, V. S. (2003). LEKHAK [MAL]: a system for online recognition of handwritten Malayalam characters. In Proceedings of the national conference on communications, IIT, Madras.
Shapiro, R. (1978). Direct linear transformation method for three-dimensional cinematography. Restoration Quarterly, 49, 197–205.
Smith, L. B., Yu, C., & Pereira, A. F. (2010). Not your mother’s view: the dynamics of toddler visual experience. Developmental Science. doi:10.1111/j.1467-7687.2009.00947.x.
Stiny, G., & Gips, J. (1978). Algorithmic aesthetics: computer models for criticism and design in the arts. California: University of California Press.
Stiny, G. (2006). Shape: talking about seeing and doing. Cambridge: MIT Press.
Symes, E., Ellis, R., & Tucker, M. (2007). Visual object affordances: object orientation. Acta Psychologica, 124, 238–255.
Teh, C. H., & Chin, R. T. (1989). On the detection of dominant points on digital curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(8), 859–872.
Ternovskiy, I., Jannson, T., & Caulfield, J. (2002). Is catastrophe theory the basis for visual perception? Three-dimensional holographic imaging. New York: Wiley. doi:10.1002/0471224545.ch10.
Thom, R. (1975). Structural stability and morphogenesis. Reading: Addison-Wesley.
Tikhanoff, V., Cangelosi, A., Fitzpatrick, P., Metta, G., Natale, L., & Nori, F. (2008). An open-source simulator for cognitive robotics research. Cogprints, article 6238.
Tsuji, T., Morasso, P., Shigehashi, K., & Kaneko, M. (1995). Motion planning for manipulators using artificial potential field approach that can adjust convergence time of generated arm trajectory. Journal of the Robotics Society of Japan, 13(3), 285–290.
Ulupinar, F., & Nevatia, R. (1990). Inferring shape from contour for curved surfaces. In Proceedings of the international conference on pattern recognition (pp. 147–154).
Visalberghi, E., & Tomasello, M. (1997). Primate causal understanding in the physical and in the social domains. Behavioral Processes, 42, 189–203.
Wallace, T., & Wintz, P. (1980). An efficient three-dimensional aircraft recognition algorithm using normalized Fourier descriptors. Computer Graphics and Image Processing, 13, 99–126.
Yu, C., Smith, L. B., Shen, H., Pereira, A. F., & Smith, T. G. (2009). Active information selection: visual attention through the hands. IEEE Transactions on Autonomous Mental Development, 1(2), 141–151.
Zak, M. (1988). Terminal attractors for addressable memory in neural networks. Physical Letters A, 133, 218–222.
Zeeman, E. C. (1977). Catastrophe theory-selected papers 1972–1977. Reading: Addison-Wesley.
Zöllner, R., Asfour, T., & Dillman, R. (2004). Programming by demonstration: dual-arm manipulation tasks for humanoid robots. In Proceedings of the IEEE/RSJ international conference on intelligent robots systems.
Author information
Authors and Affiliations
Corresponding author
Electronic Supplementary Material
Below are the links to the electronic supplementary material.
Observing a tool being used.wmv (WMV 11.2 MB)
Auro2iCubArtShort.wmv (WMV 26.7 MB)
Rights and permissions
About this article
Cite this article
Mohan, V., Morasso, P., Zenzeri, J. et al. Teaching a humanoid robot to draw ‘Shapes’. Auton Robot 31, 21–53 (2011). https://doi.org/10.1007/s10514-011-9229-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10514-011-9229-0