Abstract
In recent years, scientific and technological advances have produced artificial systems that match or surpass human capabilities in narrow domains such as face detection and optical character recognition. However, the problem of producing truly intelligent machines remains far from solved. In this chapter, we first describe some of these recent advances, and then review one approach to moving beyond these limited successes: the neuromorphic approach of studying and reverse-engineering the networks of neurons in the human brain (specifically, the visual system). Finally, we discuss several possible future directions in the quest for visual intelligence.
Copyright information
© 2013 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Tan, C., Leibo, J.Z., Poggio, T. (2013). Throwing Down the Visual Intelligence Gauntlet. In: Cipolla, R., Battiato, S., Farinella, G. (eds) Machine Learning for Computer Vision. Studies in Computational Intelligence, vol 411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28661-2_1
DOI: https://doi.org/10.1007/978-3-642-28661-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28660-5
Online ISBN: 978-3-642-28661-2
eBook Packages: Engineering, Engineering (R0)