Biological modeling of human visual system for object recognition using GLoP filters and sparse coding on multi-manifolds

  • Special Issue Paper
Machine Vision and Applications

Abstract

The hierarchical MAX (HMAX) model is a bio-inspired model that mimics visual information processing in the visual cortex. However, it does not address lower-level visual processing, such as that of the retina and lateral geniculate nucleus (LGN), and it does not sufficiently specify the properties of higher-level neurons. We therefore develop an extended HMAX model, denoted E-HMAX, that is made more biologically plausible in the following ways. First, contrast normalization is applied to the input image to simulate processing in the human retina and LGN. Second, log-polar Gabor (GLoP) filters replace Gabor filters in simulating the properties of V1 simple cells. Third, the V4 simple cell response is computed by sparse coding on multi-manifolds instead of Euclidean distance. In addition, a template learning method based on dictionary learning on multi-manifolds is proposed to select informative templates during the template learning stage. Experimental results demonstrate that the proposed model substantially outperforms the standard HMAX model and is comparable to some state-of-the-art approaches such as EBIM and OGHM-HMAX.
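As an illustration only, two of the steps named in the abstract can be sketched in Python with NumPy. The abstract gives no formulas, so everything here is an assumption rather than the authors' method: the function names `local_contrast_normalize` and `sparse_code`, the window radius, the penalty `lam`, and the choice of a plain L1 sparse coder solved with ISTA. In particular, the sparse coder below does not include the multi-manifold structure that the paper describes.

```python
import numpy as np


def local_contrast_normalize(img, radius=2, eps=1e-6):
    """Subtract the local mean and divide by the local standard deviation.

    Hypothetical stand-in for a retina/LGN-like contrast normalization step.
    """
    img = np.asarray(img, dtype=float)
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="reflect")
    # View of every k-by-k neighbourhood, shape (H, W, k, k).
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    mean = windows.mean(axis=(2, 3))
    std = windows.std(axis=(2, 3))
    return (img - mean) / (std + eps)


def sparse_code(x, D, lam=0.1, n_iter=200):
    """Solve min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 with ISTA.

    x: signal of shape (d,); D: dictionary of shape (d, n_templates).
    Plain L1 sparse coding (assumption), not the multi-manifold variant.
    """
    x = np.asarray(x, dtype=float)
    D = np.asarray(D, dtype=float)
    # Lipschitz constant of the gradient of the quadratic term.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        z = a - grad / L
        # Soft-thresholding: proximal operator of the L1 penalty.
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return a


def euclidean_response(x, D):
    """Distance from x to each template (column of D), for comparison."""
    return np.linalg.norm(D - np.asarray(x, dtype=float)[:, None], axis=0)
```

A distance-based response scores every template independently, whereas the sparse code explains the input patch jointly with a few templates and leaves the remaining coefficients at exactly zero. That contrast is the general motivation for replacing a distance with a sparse code, independent of the specific formulation used in the paper.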

Acknowledgements

This work was funded by the National Natural Science Foundation of China (No. 61671480) and the Natural Science Foundation of Shandong Province (Nos. ZR2017MF069, ZR2018MF017).

Author information

Corresponding author

Correspondence to Yanjiang Wang.

About this article

Cite this article

Deng, L., Wang, Y., Liu, B. et al. Biological modeling of human visual system for object recognition using GLoP filters and sparse coding on multi-manifolds. Machine Vision and Applications 29, 965–977 (2018). https://doi.org/10.1007/s00138-018-0928-9

  • DOI: https://doi.org/10.1007/s00138-018-0928-9
