Teaching a humanoid robot to draw ‘Shapes’

Autonomous Robots

Abstract

The core cognitive ability to perceive and synthesize ‘shapes’ underlies all our basic interactions with the world, be it shaping one’s fingers to grasp a ball or shaping one’s body while imitating a dance. In this article, we describe our attempts to understand this multifaceted problem by creating a primitive shape perception/synthesis system for the baby humanoid iCub. We specifically deal with the scenario of iCub learning to draw or scribble shapes of gradually increasing complexity, after observing a demonstration by a teacher, by using a series of self-evaluations of its performance. Learning to imitate a demonstrated human movement (specifically, visually observed end-effector trajectories of a teacher) can be considered a special case of the proposed computational machinery. This architecture is based on a loop of transformations that express the embodiment of the mechanism but, at the same time, are characterized by scale invariance and motor equivalence. The following transformations are integrated in the loop: (a) Characterizing in a compact, abstract way the ‘shape’ of a demonstrated trajectory using a finite set of critical points, derived using catastrophe theory: Abstract Visual Program (AVP); (b) Transforming the AVP into a Concrete Motor Goal (CMG) in iCub’s egocentric space; (c) Learning to synthesize a continuous virtual trajectory similar to the demonstrated shape using the discrete set of critical points defined in the CMG; (d) Using the virtual trajectory as an attractor for iCub’s internal body model, implemented by the Passive Motion Paradigm, which includes a forward and an inverse motor model; (e) Forming an Abstract Motor Program (AMP) by deriving the ‘shape’ of the self-generated movement (forward model output) using the same technique employed for creating the AVP; (f) Comparing the AVP and AMP in order to generate an internal performance score, thereby closing the learning loop.
The resulting computational framework further combines three crucial streams of learning: (1) motor babbling (self-exploration), (2) imitative action learning (social interaction) and (3) mental simulation, to give rise to sensorimotor knowledge that is endowed with seamless compositionality, generalization capability and body-effector/task independence. The robustness of the computational architecture is demonstrated by means of several experimental trials of gradually increasing complexity using a state-of-the-art humanoid platform.
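
As a rough illustration of steps (a), (e) and (f) of the loop, the sketch below extracts a coarse shape description from a sampled planar trajectory and scores how well two descriptions agree. It is not the article's method: the article derives critical points using catastrophe theory, whereas this sketch swaps in a much simpler stand-in (samples where the x or y coordinate reverses direction). The function names `critical_points`, `shape_signature` and `performance_score`, and the scoring rule itself, are hypothetical and chosen only for illustration.

```python
def critical_points(traj):
    """Return (index, kind) pairs for a planar trajectory given as (x, y) samples.

    Simplified stand-in for the catastrophe-theory analysis (an assumption):
    the endpoints, plus interior samples where x or y reverses direction.
    Flat segments (zero displacement) are ignored in this sketch.
    """
    points = [(0, "start")]
    for i in range(1, len(traj) - 1):
        dx_prev = traj[i][0] - traj[i - 1][0]
        dx_next = traj[i + 1][0] - traj[i][0]
        dy_prev = traj[i][1] - traj[i - 1][1]
        dy_next = traj[i + 1][1] - traj[i][1]
        if dx_prev * dx_next < 0:
            points.append((i, "x-extremum"))
        elif dy_prev * dy_next < 0:
            points.append((i, "y-extremum"))
    points.append((len(traj) - 1, "end"))
    return points


def shape_signature(traj):
    """Ordered list of critical-point kinds.

    Depends only on sign changes, so it is unchanged by translation
    and by uniform positive scaling of the trajectory.
    """
    return [kind for _, kind in critical_points(traj)]


def performance_score(avp, amp):
    """Hypothetical score in [0, 1]: position-wise agreement of two signatures,
    normalised by the longer signature so extra or missing points are penalised."""
    matches = sum(a == b for a, b in zip(avp, amp))
    return matches / max(len(avp), len(amp))


# Demonstrated 'U' shape and a scaled, shifted reproduction of it.
teacher = [(x / 2, (x / 2) ** 2) for x in range(-2, 3)]
robot = [(3 * x + 2, 3 * y + 2) for x, y in teacher]
line = [(float(x), float(x)) for x in range(5)]

avp = shape_signature(teacher)   # stands in for the Abstract Visual Program
amp = shape_signature(robot)     # stands in for the Abstract Motor Program
print(avp, performance_score(avp, amp))
print(performance_score(avp, shape_signature(line)))
```

Because the signature records only the order and type of direction reversals, the scaled and shifted reproduction scores the same as the demonstration, which loosely mirrors the scale invariance the abstract describes. A straight line, which has no interior critical points, scores lower.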

Author information

Corresponding author

Correspondence to Vishwanathan Mohan.

Electronic Supplementary Material

Below are the links to the electronic supplementary material.

Observing a tool being used.wmv (WMV 11.2 MB)

Auro2iCubArtShort.wmv (WMV 26.7 MB)

About this article

Cite this article

Mohan, V., Morasso, P., Zenzeri, J. et al. Teaching a humanoid robot to draw ‘Shapes’. Auton Robot 31, 21–53 (2011). https://doi.org/10.1007/s10514-011-9229-0

  • DOI: https://doi.org/10.1007/s10514-011-9229-0
