Abstract
The hierarchical MAX (HMAX) model is a bio-inspired model that mimics visual information processing in the visual cortex. However, it does not model lower-level processing, such as that in the retina and lateral geniculate nucleus (LGN), and it does not sufficiently specify the properties of higher-level neurons. We therefore develop an extended HMAX model, denoted E-HMAX, in the following biologically plausible ways. First, contrast normalization is applied to the input image to simulate processing in the human retina and LGN. Second, log-polar Gabor (GLoP) filters replace Gabor filters to simulate the properties of V1 simple cells. Third, sparse coding on multi-manifolds replaces Euclidean distance in computing the V4 simple cell response. In addition, a template learning method based on dictionary learning on multi-manifolds is proposed to select informative templates during the template learning stage. Experimental results demonstrate that the proposed model greatly outperforms the standard HMAX model and is comparable to state-of-the-art approaches such as EBIM and OGHM-HMAX.
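As an illustration of the first step, contrast normalization can be sketched as a local divisive normalization: each pixel has its neighbourhood mean subtracted and is then divided by the neighbourhood standard deviation. The abstract does not state the exact formulation used in E-HMAX, so the box window, the `radius` parameter, the `eps` stabilizer, and the function name `local_contrast_normalize` are all assumptions made for this sketch rather than the authors' method.

```python
import math


def local_contrast_normalize(image, radius=1, eps=1e-6):
    """Generic local contrast normalization (assumed form, not from the paper).

    `image` is a list of equal-length rows of floats. For every pixel, the
    mean and standard deviation are computed over a (2*radius+1)^2 box that
    is clipped at the image border.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Gather the neighbourhood, clipping the window at the borders.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            # Subtractive then divisive normalization; eps avoids division by zero.
            out[y][x] = (image[y][x] - mean) / (math.sqrt(var) + eps)
    return out


if __name__ == "__main__":
    ramp = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
    for row in local_contrast_normalize(ramp):
        print(["%.3f" % v for v in row])
```

Because the output is a ratio of local deviation to local spread, multiplying the input by a constant leaves the result almost unchanged, which is the illumination-invariance property that motivates this kind of preprocessing.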

Acknowledgements
This work was funded by the National Natural Science Foundation of China (No. 61671480) and the Natural Science Foundation of Shandong Province (Nos. ZR2017MF069, ZR2018MF017).
Cite this article
Deng, L., Wang, Y., Liu, B. et al. Biological modeling of human visual system for object recognition using GLoP filters and sparse coding on multi-manifolds. Machine Vision and Applications 29, 965–977 (2018). https://doi.org/10.1007/s00138-018-0928-9
DOI: https://doi.org/10.1007/s00138-018-0928-9