
Capsule Networks for Attention Under Occlusion

  • Conference paper

Part of the book series: Lecture Notes in Computer Science, vol. 11731

Abstract

Capsule Neural Networks (CapsNets) attempt to model the neural organization of biological neural networks. Through the routing-by-agreement algorithm, an attention mechanism emerges: individual capsules focus on specific upstream capsules while ignoring the rest. Using this routing algorithm, CapsNets are able to attend to overlapping digits from the MNIST dataset. In this work, we evaluate the attention capabilities of Capsule Networks under routing-by-agreement using occluded shape stimuli similar to those employed in neurophysiological studies. To do so, we implement a more compact type of capsule network. Our results on classifying both non-occluded and occluded shapes show that CapsNets can indeed differentiate occlusions from near-occlusion situations, as real biological neurons do. In our experiments, reconstruction of the occluded stimuli also yields promising results.
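The routing-by-agreement mechanism the abstract refers to can be sketched in a few lines of NumPy. This is a simplified, illustrative version of the dynamic routing procedure introduced by Sabour et al. (2017), not the paper's implementation; the function names and toy tensor shapes are ours:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Non-linearity that shrinks short vectors toward zero and caps long ones below unit length."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def routing_by_agreement(u_hat, num_iters=3):
    """Dynamic routing over prediction vectors u_hat of shape (num_in, num_out, dim)."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                      # raw routing logits
    for _ in range(num_iters):
        # Coupling coefficients: softmax over the output capsules each input predicts for.
        c = np.exp(b - b.max(axis=1, keepdims=True))
        c /= c.sum(axis=1, keepdims=True)
        s = np.einsum('ij,ijd->jd', c, u_hat)            # weighted sum per output capsule
        v = squash(s)                                    # output capsule activity vectors
        b += np.einsum('ijd,jd->ij', u_hat, v)           # reward predictions that agree with v
    return v

# Toy example: 8 input capsules predicting 4 output capsules of dimension 16.
rng = np.random.default_rng(0)
v = routing_by_agreement(rng.standard_normal((8, 4, 16)))
```

The agreement update is what yields the attention-like behavior discussed above: an input capsule whose prediction aligns with an output capsule's activity routes more of its output there on the next iteration, while disagreeing inputs are progressively ignored.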



Acknowledgments

We would like to thank Sebastian Stabinger for his useful comments and Prof. Anitha Pasupathy for providing the program to create the single shape stimuli.

Author information

Correspondence to Antonio Rodríguez-Sánchez.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Rodríguez-Sánchez, A., Dick, T. (2019). Capsule Networks for Attention Under Occlusion. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. Lecture Notes in Computer Science, vol. 11731. Springer, Cham. https://doi.org/10.1007/978-3-030-30493-5_50

  • DOI: https://doi.org/10.1007/978-3-030-30493-5_50

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30492-8

  • Online ISBN: 978-3-030-30493-5

  • eBook Packages: Computer Science; Computer Science (R0)
