An effective 3D target recognition model imitating robust methods of the human visual system

  • Theoretical Advances
  • Pattern Analysis and Applications

Abstract

This paper presents a model of 3D object recognition motivated by the robust properties of the human visual system (HVS). The HVS shows the best efficiency and robustness in object identification tasks. Its robust properties are visual attention, the contrast mechanism, feature binding, multi-resolution processing, size tuning, and part-based representation. In addition, bottom-up and top-down information are combined cooperatively. Based on these facts, a plausible computational model that integrates them under a Monte Carlo optimization technique is proposed. In this scheme, object recognition is regarded as a parameter optimization problem: the bottom-up process initializes the parameters in a discriminative way, and the top-down process optimizes them in a generative way. Experimental results show that the proposed recognition model is feasible for 3D object identification and pose estimation in visible and infrared band images.
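The division of labor described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the 2D point model, the 12-bin view quantization, the squared-distance energy, and the annealed Metropolis sampler are all assumptions chosen for brevity, and every function name is hypothetical. A coarse data-driven guess initializes the pose parameters (bottom-up), and a Monte Carlo search then refines them by repeatedly rendering the model and comparing it with the observation (top-down).

```python
import math
import random

# Hypothetical 2D object model: a few feature points (not taken from the paper).
MODEL = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]


def render(params):
    """Generative step: place the model at pose (angle, tx, ty)."""
    angle, tx, ty = params
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in MODEL]


def energy(params, observed):
    """Sum of squared distances between rendered and observed points."""
    return sum((px - ox) ** 2 + (py - oy) ** 2
               for (px, py), (ox, oy) in zip(render(params), observed))


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def bottom_up_init(observed, bins=12):
    """Coarse data-driven guess: a quantized view angle plus a centroid shift."""
    (x0, y0), (x1, y1) = observed[0], observed[1]
    (mx0, my0), (mx1, my1) = MODEL[0], MODEL[1]
    raw = math.atan2(y1 - y0, x1 - x0) - math.atan2(my1 - my0, mx1 - mx0)
    step = 2.0 * math.pi / bins
    angle = round(raw / step) * step          # snap to the nearest view bin
    cx, cy = centroid(observed)
    rx, ry = centroid(render((angle, 0.0, 0.0)))
    return (angle, cx - rx, cy - ry)


def top_down_refine(params, observed, steps=5000, sigma=0.05, seed=0):
    """Annealed Metropolis search over the pose, tracking the best sample."""
    rng = random.Random(seed)
    current, e_cur = params, energy(params, observed)
    best, e_best = current, e_cur
    for i in range(steps):
        temp = max(1e-3, 1.0 - i / steps)     # linear cooling schedule
        proposal = tuple(p + rng.gauss(0.0, sigma) for p in current)
        e_new = energy(proposal, observed)
        # Always accept improvements; accept worse samples with Boltzmann probability.
        if e_new < e_cur or rng.random() < math.exp((e_cur - e_new) / temp):
            current, e_cur = proposal, e_new
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best


if __name__ == "__main__":
    true_pose = (0.4, 1.5, -0.7)              # made-up ground truth for the demo
    observed = render(true_pose)
    init = bottom_up_init(observed)
    refined, e_best = top_down_refine(init, observed)
    print("initial pose:", init, "energy:", energy(init, observed))
    print("refined pose:", refined, "energy:", e_best)
```

Because the sampler keeps the best pose seen so far, the refined energy can never exceed that of the bottom-up initialization. A better initial guess mainly shortens the search.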

Acknowledgements

This research was supported by the Korean Ministry of Science and Technology through the National Research Laboratory Program (grant number M1-0302-00-0064), Korea.

Author information

Corresponding author

Correspondence to Sungho Kim.

About this article

Cite this article

Kim, S., Jang, G. & Kweon, I.S. An effective 3D target recognition model imitating robust methods of the human visual system. Pattern Anal Applic 8, 211–226 (2005). https://doi.org/10.1007/s10044-005-0001-y

  • DOI: https://doi.org/10.1007/s10044-005-0001-y
