Abstract
The performance of visual perception algorithms for object category detection has largely been restricted by the limited generalizability and scalability of state-of-the-art hand-crafted feature detectors and descriptors across object instances with different shapes, textures, and appearances. Recently introduced deep learning algorithms have attempted to overcome this limitation through automatic learning of feature kernels. Nevertheless, conventional deep learning architectures are uni-modal, essentially feedforward testing pipelines operating on image space with little regard for context and semantics. In this paper, we address this issue by presenting a new framework for object categorization based on deep learning, called Parallel Deep Learning with Suggestive Activation (PDLSA), that incorporates several brain operating principles drawn from neuroscience and psychophysical studies. In particular, we focus on Suggestive Activation, a schema that introduces feedback loops into the recognition process. Partial detection results are used to generate hypotheses from long-term memory (or a knowledge base); the image space is then searched for features corresponding to these hypotheses, enabling activation of the response for the correct object category through multi-modal integration. Results against a traditional SIFT-based category classifier on the University of Washington benchmark RGB-D dataset demonstrate the validity of the approach.
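The Suggestive Activation loop described above (partial detections → hypotheses from long-term memory → targeted search in image space → updated activations) can be illustrated with a minimal sketch. Everything here is hypothetical: the toy knowledge base, the feature names, and the scoring are placeholders standing in for the paper's learned features and multi-modal integration, not the authors' implementation.

```python
# Hypothetical sketch of a Suggestive Activation feedback loop.
# Long-term memory: each category suggests the features associated with it.
KNOWLEDGE_BASE = {
    "mug":    {"handle", "rim", "cylinder"},
    "bowl":   {"rim", "concave_surface"},
    "bottle": {"cap", "cylinder", "neck"},
}

def search_image_for(feature, image_features):
    """Stand-in for a targeted search in image space for one feature."""
    return feature in image_features

def suggestive_activation(partial_detections, image_features, max_rounds=3):
    """Iteratively refine category activations via hypothesis-driven search."""
    found = set(partial_detections)
    activation = {cat: 0.0 for cat in KNOWLEDGE_BASE}
    for _ in range(max_rounds):
        # Feedforward pass: score each category by the features found so far.
        for cat, feats in KNOWLEDGE_BASE.items():
            activation[cat] = len(found & feats) / len(feats)
        # Feedback pass: the best current hypothesis suggests what to look for.
        best = max(activation, key=activation.get)
        missing = KNOWLEDGE_BASE[best] - found
        newly_found = {f for f in missing if search_image_for(f, image_features)}
        if not newly_found:
            break  # no new evidence found; activations have converged
        found |= newly_found
    return max(activation, key=activation.get), activation

# A lone handle detection triggers a "mug" hypothesis, and the feedback
# search then confirms the rim and cylinder features in the image.
category, scores = suggestive_activation(
    partial_detections={"handle"},
    image_features={"rim", "handle", "cylinder"},
)
print(category)  # the feedback loop settles on "mug"
```

The key design point mirrored here is that recognition is not a single feedforward pass: a partial result drives a top-down query of memory, which in turn directs where the system looks next.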
References
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)
Kim, S., Yoon, K.-J., Kweon, I.S.: Object recognition using a generalized robust invariant feature and Gestalt’s law of proximity and similarity. Pattern Recognition 41(2), 726–741 (2008)
Bay, H., Tuytelaars, T., Van Gool, L.: SURF: Speeded up robust features. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006, Part I. LNCS, vol. 3951, pp. 404–417. Springer, Heidelberg (2006)
Mikolajczyk, K., Schmid, C.: Scale & affine invariant interest point detectors. International Journal of Computer Vision 60(1), 63–86 (2004)
Forssén, P.-E., Lowe, D.G.: Shape descriptors for maximally stable extremal regions. In: IEEE 11th International Conference on Computer Vision, ICCV 2007. IEEE (2007)
Fei-Fei, L., Perona, P.: A Bayesian hierarchical model for learning natural scene categories. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 2. IEEE (2005)
Jarrett, K., et al.: What is the best multi-stage architecture for object recognition? In: 2009 IEEE 12th International Conference on Computer Vision. IEEE (2009)
Bengio, Y.: Learning deep architectures for AI. Foundations and Trends in Machine Learning 2(1), 1–127 (2009)
Cover, T.M., Thomas, J.A.: Elements of information theory. Wiley-Interscience (2006)
Field, D.J.: Relations between the statistics of natural images and the response properties of cortical cells. J. Opt. Soc. Am. A 4(12), 2379–2394 (1987)
Field, D.J.: What is the goal of sensory coding? Neural Computation 6(4) (1994)
Caruana, R.: Multitask learning. Machine Learning 28(1), 41–75 (1997)
Biederman, I.: Recognition-by-components: a theory of human image understanding. Psychological Review 94(2), 115 (1987)
Arbib, M.A. (ed.): The handbook of brain theory and neural networks. MIT Press
Rogers, T., et al.: Object recognition under semantic impairment: The effects of conceptual regularities on perceptual decisions. Language and Cognitive Processes 18(5-6), 625–662 (2003)
Varadarajan, K.M., Vincze, M.: AfNet: The affordance network. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012, Part I. LNCS, vol. 7724, pp. 512–523. Springer, Heidelberg (2013)
Gibson, J.J.: The concept of affordances. Perceiving, Acting, and Knowing, 67–82 (1977)
Prinz, W.: Modes of linkage between perception and action. Cognition and motor processes, pp. 185–193. Springer, Heidelberg (1984)
Kohler, E., et al.: Hearing sounds, understanding actions: action representation in mirror neurons. Science 297(5582), 846–848 (2002)
Varadarajan, K.M.: k-TR: Karmic Tabula Rasa – A Theory of Visual Perception. In: Conference of the International Society of Psychophysics - ISP (2011)
Varadarajan, K.M., Vincze, M.: Knowledge representation and inference for grasp affordances. In: Crowley, J.L., Draper, B.A., Thonnat, M. (eds.) ICVS 2011. LNCS, vol. 6962, pp. 173–182. Springer, Heidelberg (2011)
Varadarajan, K.M., Vincze, M.: AfRob: The affordance network ontology for robots. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE (2012)
AfNet: The Affordance Network (2013), http://www.theaffordances.net
© 2013 Springer-Verlag Berlin Heidelberg
Varadarajan, K.M., Vincze, M. (2013). Parallel Deep Learning with Suggestive Activation for Object Category Recognition. In: Chen, M., Leibe, B., Neumann, B. (eds) Computer Vision Systems. ICVS 2013. Lecture Notes in Computer Science, vol 7963. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39402-7_36
DOI: https://doi.org/10.1007/978-3-642-39402-7_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39401-0
Online ISBN: 978-3-642-39402-7