
Parallel Deep Learning with Suggestive Activation for Object Category Recognition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7963)

Abstract

The performance of visual perception algorithms for object category detection has largely been restricted by the limited generalizability and scalability of state-of-the-art hand-crafted feature detectors and descriptors across object instances with different shapes, textures, etc. Recently introduced deep learning algorithms have attempted to overcome this limitation through automatic learning of feature kernels. Nevertheless, conventional deep learning architectures are uni-modal, essentially feedforward testing pipelines that operate on the image space with little regard for context and semantics. In this paper, we address this issue by presenting a new framework for object categorization based on deep learning, called Parallel Deep Learning with Suggestive Activation (PDLSA), which incorporates several brain operating principles drawn from neuroscience and psychophysical studies. In particular, we focus on Suggestive Activation, a schema that introduces feedback loops into the recognition process: information obtained from partial detection results is used to generate hypotheses from long-term memory (a knowledge base), the image space is then searched for features corresponding to these hypotheses, and multi-modal integration finally activates the response corresponding to the correct object category. Results against a traditional SIFT-based category classifier on the University of Washington benchmark RGB-D dataset demonstrate the validity of the approach.
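The hypothesis-driven feedback loop summarized in the abstract can be sketched schematically. This is a minimal toy illustration only, assuming a dictionary-style knowledge base of category-to-feature associations and a caller-supplied targeted search function; all names (`KNOWLEDGE_BASE`, `suggest_hypotheses`, `suggestive_activation`, the feature vocabulary) are illustrative assumptions, not the authors' implementation.

```python
# Schematic sketch of a suggestive-activation loop: partial detections
# generate category hypotheses from long-term memory, which in turn drive
# a targeted search of the image for the features they predict.

# Long-term memory / knowledge base: category -> expected features
# (a hypothetical toy vocabulary, for illustration only).
KNOWLEDGE_BASE = {
    "mug":    {"handle", "cylindrical_body", "rim"},
    "bowl":   {"rim", "concave_body"},
    "kettle": {"handle", "spout", "cylindrical_body"},
}

def suggest_hypotheses(partial_features):
    """Rank categories by the fraction of expected features detected so far."""
    scores = {
        category: len(partial_features & expected) / len(expected)
        for category, expected in KNOWLEDGE_BASE.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def suggestive_activation(initial_features, search, max_rounds=3):
    """Feedback loop: the current best hypothesis determines which missing
    features to search for; confirmed features refine the hypothesis until
    one category is fully supported or the search stops yielding evidence."""
    detected = set(initial_features)
    for _ in range(max_rounds):
        best, score = suggest_hypotheses(detected)[0]
        if score == 1.0:  # all expected features confirmed
            return best
        missing = KNOWLEDGE_BASE[best] - detected
        found = {f for f in missing if search(f)}  # targeted re-examination
        if not found:
            return best  # activate the best available hypothesis
        detected |= found
    return suggest_hypotheses(detected)[0][0]

# Toy usage: the first pass yields only a handle; the targeted search then
# confirms a cylindrical body and a rim, activating "mug".
visible = {"handle", "cylindrical_body", "rim"}
category = suggestive_activation({"handle"}, search=lambda f: f in visible)
print(category)  # -> mug
```

The sketch deliberately omits the parallel multi-modal streams and the learned feature kernels of the full PDLSA framework; it only illustrates how partial evidence and a knowledge base can close the recognition loop.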





Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Varadarajan, K.M., Vincze, M. (2013). Parallel Deep Learning with Suggestive Activation for Object Category Recognition. In: Chen, M., Leibe, B., Neumann, B. (eds) Computer Vision Systems. ICVS 2013. Lecture Notes in Computer Science, vol 7963. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39402-7_36


  • DOI: https://doi.org/10.1007/978-3-642-39402-7_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39401-0

  • Online ISBN: 978-3-642-39402-7

  • eBook Packages: Computer Science (R0)
