Towards an Embodied Developing Vision System

  • Technical Contribution
  • KI - Künstliche Intelligenz

Abstract

Many cognitive scientists now agree that artificial cognition can probably be achieved developmentally: a system starts from a set of basic, immature capabilities and incrementally extends itself with experience, through discrete or continuous stages. Although we are still far from seeing a full-fledged, self-extending artificial cognitive system, the literature has provided promising examples and demonstrations. Nonetheless, little thought has been given to how an artificial vision system, an important part of a developing cognitive system, can develop itself in a similar manner. In this article, we discuss the issue of a developing vision system, the relevant problems and, where possible, potential solutions.

Notes

  1. According to [43], a problem is well-posed if (1) a solution exists, (2) the solution is unique, and (3) the solution depends continuously on the data. A problem is ill-posed if it is not well-posed.
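
As a hedged illustration of condition (3), not taken from the article: differentiation is a standard textbook example of an ill-posed problem. The symbols f, f_ε and ε below are assumptions introduced only for this sketch.

```latex
% Data: a differentiable function f on [0,1]; "solution": its derivative f'.
% Perturb the data by a small, rapidly oscillating term (\varepsilon > 0):
f_\varepsilon(x) = f(x) + \varepsilon \sin\!\left(\frac{x}{\varepsilon^{2}}\right)
% The change in the data is small in the sup norm:
\| f_\varepsilon - f \|_\infty \le \varepsilon \;\to\; 0
% but the change in the solution is not:
f_\varepsilon'(x) - f'(x) = \frac{1}{\varepsilon} \cos\!\left(\frac{x}{\varepsilon^{2}}\right),
\qquad
\| f_\varepsilon' - f' \|_\infty = \frac{1}{\varepsilon} \;\to\; \infty
\quad (\varepsilon \to 0)
```

A solution exists and is unique here, yet arbitrarily small changes in the data produce arbitrarily large changes in the solution, so condition (3) fails.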

References

  1. Adolph KE, Eppler MA, Gibson EJ (1993) Crawling versus walking infants’ perception of affordances for locomotion over sloping surfaces. Child Dev 64(4):1158–1174

  2. Aloimonos J (1990) Purposive and qualitative active vision. In: 10th international conference on pattern recognition, IEEE, vol 1, pp 346–360

  3. Aloimonos Y (2013) Active perception. Lawrence Erlbaum Associates, New Jersey

  4. Aloimonos Y, Rosenfeld A (1994) Principles of computer vision. In: Young TY (ed) Handbook of pattern recognition and image processing: computer vision. Chap. 1. Academic Press, San Diego, pp 1–15

  5. Altamura M, Carver FW, Elvevåg B, Weinberger DR, Coppola R (2014) Dynamic cortical involvement in implicit anticipation during statistical learning. Neurosci Lett 558:73–77

  6. Asada M, Hosoda K, Kuniyoshi Y, Ishiguro H, Inui T, Yoshikawa Y, Ogino M, Yoshida C (2009) Cognitive developmental robotics: a survey. IEEE Trans Auton Ment Dev 1(1):12–34

  7. Atil I (2010) Function and appearance based emergence of object concepts through affordances. Master’s thesis, Dept. of Computer Engineering, Middle East Technical University

  8. Barakat BK, Seitz AR, Shams L (2013) The effect of statistical learning on internal stimulus representations: predictable items are enhanced even when not predicted. Cognition 129(2):205–211

  9. Barlow H (2001) The exploitation of regularities in the environment by the brain. Behav Brain Sci 24(04):602–607

  10. Bengio Y (2009) Learning deep architectures for AI. Found Trends Mach Learn 2(1):1–127

  11. Bengio Y, Courville A, Vincent P (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35(8):1798–1828

  12. Berkes P, Wiskott L (2005) Slow feature analysis yields a rich repertoire of complex cell properties. J Vis 5(6):579–602

  13. Bertero M, Poggio T, Torre V (1987) Ill-posed problems in early vision. Tech. rep, Cambridge

  14. Boden MA (2006) Mind as machine: a history of cognitive science. Oxford University Press, Oxford

  15. Brooks RA (1997) From earwigs to humans. Robot Auton Syst 20(2):291–304

  16. Brunswik E, Kamiya J (1953) Ecological cue-validity of ’proximity’ and of other Gestalt factors. Am J Psychol LXVI:20–32

  17. Cangelosi A, Metta G, Sagerer G, Nolfi S, Nehaniv C, Fischer K, Tani J, Belpaeme T, Sandini G, Nori F et al (2010) Integration of action and language knowledge: a roadmap for developmental robotics. IEEE Trans Auton Ment Dev 2(3):167–195

  18. Ciresan D, Meier U, Schmidhuber J (2012) Multi-column deep neural networks for image classification. In: IEEE conference on computer vision and pattern recognition (CVPR), IEEE, pp 3642–3649

  19. Cohen LB, Cashon CH (2003) Infant perception and cognition. In: Lerner RM, Easterbrooks MA, Mistry J (eds) Handbook of psychology: developmental psychology. Chap. 3. John Wiley and Sons Inc, USA, pp 65–89

  20. Conway CM, Christiansen MH (2005) Modality-constrained statistical learning of tactile, visual, and auditory sequences. J Exp Psychol Learn Mem Cogn 31(1):24

  21. Elder J, Goldberg R (2002) Ecological statistics of Gestalt laws for the perceptual organization of contours. J Vis 2(4):324–353

  22. Elder JH, Krupnik A, Johnston LA (2003) Contour grouping with prior models. IEEE Trans Pattern Anal Mach Intell 25(25):1–14

  23. Elman JL (1993) Learning and development in neural networks: the importance of starting small. Cognition 48(1):71–99

  24. Ferrell C, Kemp C (1996) An ontogenetic perspective to scaling sensorimotor intelligence. In: Embodied cognition and action: papers from the 1996 AAAI Fall Symposium, vol 5

  25. Fidler S, Leonardis A (2007) Towards scalable representations of object categories: learning a hierarchy of parts. In: IEEE conference on computer vision and pattern recognition (CVPR)

  26. Field D (1994) What is the goal of sensory coding? Neural Comput 6(4):561–601

  27. Field DJ, Hayes A, Hess RF (1993) Contour integration by the human visual system: evidence for a local “association field”. Vis Res 33(2):173–193

  28. Fikes RE, Nilsson NJ (1972) STRIPS: a new approach to the application of theorem proving to problem solving. Artif Intell 2(3):189–208

  29. Firat O, Can G, Vural FY (2014) Representation learning for contextual object and region detection in remote sensing. In: International conference on pattern recognition (ICPR)

  30. Fiser J, Aslin RN (2002) Statistical learning of new visual feature combinations by infants. Proc Natl Acad Sci 99(24):15822–15826

  31. Fodor JA (1981) Representations: philosophical essays on the foundations of cognitive science. MIT Press/Bradford Books, Cambridge

  32. Franzius M, Wilbert N, Wiskott L (2008) Invariant object recognition with slow feature analysis. In: Proceedings of the 18th international conference on Artificial Neural Networks, Part I. Springer, pp 961–970

  33. Fry AF, Hale S (1996) Processing speed, working memory, and fluid intelligence: evidence for a developmental cascade. Psychol Sci 7(4):237–241

  34. Garrett HE (1946) A developmental theory of intelligence. Am Psychol 1(9):372–378

  35. Geisler W, Perry J, Super B, Gallogly D (2001) Edge co-occurrence in natural images predicts contour grouping performance. Vis Res 41(6):711–724

  36. Gibson EJ (1969) Principles of perceptual learning and development. Appleton-Century-Crofts

  37. Gibson EJ (2000) Perceptual learning in development: some basic concepts. Ecol Psychol 12(4):295–302

  38. Gibson EJ, Walk RD (1960) The “visual cliff”. Sci Am 202:67–71

  39. Gibson JJ (1977) The theory of affordances. In: Shaw R, Bransford J (eds) Perceiving, acting, and knowing: toward an ecological psychology. Wiley, New Jersey, pp 67–82

  40. Gibson JJ, Gibson EJ (1955) Perceptual learning: differentiation or enrichment? Psychol Rev 62(1):32–41

  41. Glenberg AM, Robertson DA (2000) Symbol grounding and meaning: a comparison of high-dimensional and embodied theories of meaning. J Mem Lang 43(3):379–401

  42. Gopnik A, Schulz L (2004) Mechanisms of theory formation in young children. Trends Cogn Sci 8(8):371–377

  43. Hadamard J (1923) Lectures on the Cauchy problem in linear partial differential equations. Yale, New Haven

  44. Harnad S (1990) The symbol grounding problem. Physica D Nonlinear Phenom 42(1):335–346

  45. Hinton GE, Osindero S, Teh YW (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554

  46. Howe CQ, Purves D (2002) Range image statistics can explain the anomalous perception of length. PNAS 99(20):13184–13188

  47. Howe CQ, Purves D (2004) Size contrast and assimilation explained by the statistics of natural scene geometry. J Cogn Neurosci 16(1):90–102

  48. Huang J, Lee AB, Mumford D (2000) Statistics of range images. IEEE Conf Comput Vis Pattern Recogn (CVPR) 1(1):1324–1331

  49. Hyvärinen A, Hoyer PO (2001) A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. Vis Res 41(18):2413–2423

  50. Hyvärinen A, Hurri J, Hoyer PO (2009) Natural image statistics: a probabilistic approach to early computational vision. Springer, London

  51. Kalkan S (2008) Multi-modal statistics of local image structures and its applications for depth prediction. PhD thesis, Dept. of Informatics, University of Goettingen, Germany

  52. Kalkan S, Wörgötter F, Krüger N (2006) Statistical analysis of local 3D structure in 2D images. In: IEEE Computer Society conference on computer vision and pattern recognition (CVPR), pp 1114–1121

  53. Kalkan S, Krüger N, Wörgötter F (2007) First-order and second-order statistical analysis of 3D and 2D structure. Netw Comput Neural Syst 18(2):129–160

  54. Kalkan S, Wörgötter F, Krüger N (2008) Depth prediction at homogeneous image structures. In: International conference on computer vision theory and applications (VISAPP)

  55. Kellman P, Arterberry M (eds) (1998) The cradle of knowledge: development of perception in infancy. MIT Press, Cambridge

  56. Kirkham NZ, Slemmer JA, Johnson SP (2002) Visual statistical learning in infancy: evidence for a domain general learning mechanism. Cognition 83(2):B35–B42

  57. Knill DC, Richards W (eds) (1996) Perception as bayesian inference. Cambridge University Press, Cambridge

  58. König P, Krüger N (2006) Symbols as self-emergent entities in an optimization process of feature extraction and predictions. Biol Cybern 94(4):325–334

  59. Kraft D, Detry R, Pugeault N, Baseski E, Guerin F, Piater JH, Krüger N (2010) Development of object and grasping knowledge by robot exploration. IEEE Trans Auton Ment Dev 2(4):368–383

  60. Krüger N (1998) Collinearity and parallelism are statistically significant second order relations of complex cell responses. Neural Process Lett 8(2):117–129

  61. Krüger N, Wörgötter F (2002) Multi modal estimation of collinearity and parallelism in natural image sequences. Netw Comput Neural Syst 13(4):553–576

  62. Krüger N, Wörgötter F (2004) Statistical and deterministic regularities: utilisation of motion and grouping in biological and artificial visual systems. Adv Imaging Electron Phys 131:82–147

  63. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

  64. Lungarella M, Metta G, Pfeifer R, Sandini G (2003) Developmental robotics: a survey. Connect Sci 15(4):151–190

  65. Minsky ML (1963) Steps towards artificial intelligence. In: Feigenbaum E, Feldman J (eds) Computers and Thought. McGraw-Hill, New York, pp 406–450

  66. Newell A, Simon HA (1976) Computer science as empirical inquiry: symbols and search. Commun ACM 19(3):113–126

  67. Olshausen BA, Field DJ (1997) Sparse coding with an overcomplete basis set: a strategy employed by V1? Vis Res 37(23):3311–3325

  68. Overton WF (2003) Development across the life span. In: Easterbrooks MA, Mistry J (eds) Handbook of psychology: One. Wiley, New Jersey, pp 11–42

  69. Potetz B, Lee TS (2003) Statistical correlations between two-dimensional images and three-dimensional structures in natural scenes. J Opt Soc Am 20(7):1292–1303

  70. Pugeault N, Krüger N, Wörgötter F (2004) A non-local stereo similarity based on collinear groups. In: Proceedings of the fourth international ICSC symposium on engineering of intelligent systems

  71. Purves D, Lotto B (eds) (2002) Why we see what we do: an empirical theory of vision. Sinauer Associates, Sunderland

  72. Qin AK, Suganthan PN (2004) Robust growing neural gas algorithm with application in cluster analysis. Neural Netw 17(8):1135–1148

  73. Rao RPN, Olshausen BA, Lewicki MS (eds) (2002) Probabilistic models of the brain. MIT Press, MA

  74. Rifai S, Bengio Y, Courville A, Vincent P, Mirza M (2012) Disentangling factors of variation for facial expression recognition. In: European conference on computer vision. Springer, pp 808–822

  75. Rocha A (1997) The brain as a symbol-processing machine. Prog Neurobiol 53(2):121–198

  76. Saffran JR (2003) Statistical language learning mechanisms and constraints. Curr Dir Psychol Sci 12(4):110–114

  77. Saffran JR, Aslin RN, Newport EL (1996) Statistical learning by 8-month-old infants. Science 274(5294):1926–1928

  78. Salakhutdinov R, Hinton GE (2009) Deep boltzmann machines. In: International conference on artificial intelligence and statistics, pp 448–455

  79. Samuel AL (2000) Some studies in machine learning using the game of checkers. IBM J Res Dev 44(1.2):206–226

  80. Sandini G (1997) Artificial systems and neuroscience. In: Proceeding of the Otto and Martha Fischbeck seminar on active vision

  81. Sarkar S, Boyer KL (1993) Perceptual organization in computer vision: a review and a proposal for a classificatory structure. IEEE Trans Syst Man Cybern 23(2):382–399

  82. Scalzo F, Piater JH (2005) Statistical learning of visual feature hierarchies. IEEE Workshop Learn Comput Vis Pattern Recogn 3:44–44

  83. Schmidhuber J (2014) Deep learning in neural networks: an overview. CoRR abs/1404.7828

  84. Selfridge OG (1958) Pandemonium: a paradigm for learning in mechanisation of thought processes. In: Proceedings of a symposium held at the National Physical Laboratory, pp 513–526

  85. Simoncelli E, Olshausen B (2001) Natural image statistics and neural representations. Annu Rev Neurosci 24:1193–1216

  86. Simoncelli EP (2003) Vision and the statistics of the visual environment. Curr Opin Neurobiol 13(2):144–149

  87. Smith L, Yu C (2008) Infants rapidly learn word-referent mappings via cross-situational statistics. Cognition 106(3):1558–1568

  88. Spelke E (1990) Principles of object perception. Cogn Sci 14(1):29–56

  89. Thelen E, Smith L (1994) A dynamic systems approach to the development of cognition and action. MIT Press, Cambridge

  90. Tomasello M (2009) Constructing a language: a usage-based theory of language acquisition. Harvard University Press, Cambridge

  91. Turk-Browne NB, Scholl BJ, Chun MM, Johnson MK (2009) Neural evidence of statistical learning: efficient detection of visual regularities without awareness. J Cogn Neurosci 21(10):1934–1945

  92. Turk-Browne NB, Scholl BJ, Johnson MK, Chun MM (2010) Implicit perceptual anticipation triggered by statistical learning. J Neurosci 30(33):11177–11187

  93. Vapnik V (2000) The nature of statistical learning theory. Springer, New York

  94. Wagemans J, Elder JH, Kubovy M, Palmer SE, Peterson MA, Singh M, von der Heydt R (2012) A century of gestalt psychology in visual perception: I. Perceptual grouping and figure-ground organization. Psychol Bull 138(6):1172

  95. Yang Z, Purves D (2003) Image/source statistics of surfaces in natural scenes. Netw Comput Neural Syst 14(3):371–390

  96. Zhu SC (1999) Embedding gestalt laws in markov random fields. IEEE Trans Pattern Anal Mach Intell 21(11):1170–1187

Acknowledgments

We would like to thank Orhan Firat and Fatos Yarman Vural for contributing the figure on the feature learning example. We acknowledge the use of the facilities provided by the Modeling and Simulation Center of METU (MODSIMMER) for the article.

Author information

Corresponding author

Correspondence to Sinan Kalkan.

About this article

Cite this article

Atıl, İ., Kalkan, S. Towards an Embodied Developing Vision System. Künstl Intell 29, 41–50 (2015). https://doi.org/10.1007/s13218-015-0351-6

  • DOI: https://doi.org/10.1007/s13218-015-0351-6
