A machine learning based intelligent vision system for autonomous object detection and recognition

Applied Intelligence

Abstract

Existing object recognition techniques often rely on human-labeled data, which severely limits the design of a fully autonomous machine vision system. In this work, we present an intelligent machine vision system able to autonomously learn individual objects present in a real environment. The system relies on salient object detection, and its design is inspired by the early processing stages of the human visual system. In this context, we propose a novel, fast algorithm for visually salient object detection that is robust to real-world illumination conditions. We then use it to extract salient objects, which can be used efficiently to train the machine-learning-based object detection and recognition unit of the proposed system. We report the results of our salient object detection algorithm on the MSRA Salient Object Database benchmark and compare its quality with other state-of-the-art approaches. The proposed system has been implemented on a humanoid robot, increasing its autonomy in learning and in interaction with humans. We report and discuss the obtained results, which validate the proposed concepts.
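To make the idea of extracting salient objects as training material more concrete, the sketch below shows one very simple way such a step could look. It is not the algorithm proposed in the article, which is not described in this abstract. It substitutes a plain global colour-contrast measure (distance of each pixel from the mean image colour) and an assumed threshold of twice the mean saliency, and it makes no attempt at illumination robustness. All function names and the `factor` parameter are hypothetical.

```python
import numpy as np


def saliency_map(image):
    """Global colour-contrast saliency (an assumed stand-in measure):
    Euclidean distance of every pixel from the mean image colour."""
    pixels = image.astype(np.float64)
    mean_colour = pixels.reshape(-1, pixels.shape[-1]).mean(axis=0)
    return np.linalg.norm(pixels - mean_colour, axis=-1)


def salient_bounding_box(image, factor=2.0):
    """Return (top, left, bottom, right) enclosing all pixels whose
    saliency exceeds factor * mean saliency, or None if there are none.
    The threshold rule is an assumption made for this sketch."""
    sal = saliency_map(image)
    mask = sal > factor * sal.mean()
    if not mask.any():
        return None
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1


def extract_training_crop(image, factor=2.0):
    """Crop the salient region so it could serve as an unlabeled
    training sample for a downstream recognizer."""
    box = salient_bounding_box(image, factor)
    if box is None:
        return None
    top, left, bottom, right = box
    return image[top:bottom, left:right]


# Synthetic example: a red patch on a uniform grey background.
scene = np.full((20, 20, 3), 100, dtype=np.uint8)
scene[5:9, 6:12] = (255, 0, 0)
crop = extract_training_crop(scene)
```

In this toy scene the crop covers only the red patch. A real system would need a saliency measure that tolerates shadows, highlights, and cluttered backgrounds, which is the problem the article addresses.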

Notes

  1. http://research.microsoft.com/en-us/um/people/jiansun/SalientObject/salient_object.htm.

  2. Based on the executable available at http://ivrg.epfl.ch/supplementary_material/RK_CVPR09/index.html.

  3. This video can be found at the following address: http://www.youtube.com/watch?v=xxz3wm3L1pE.

Author information

Corresponding author

Correspondence to Christophe Sabourin.

Appendix: Image segmentation algorithm in siRGB

About this article

Cite this article

Ramík, D.M., Sabourin, C., Moreno, R. et al. A machine learning based intelligent vision system for autonomous object detection and recognition. Appl Intell 40, 358–375 (2014). https://doi.org/10.1007/s10489-013-0461-5

  • DOI: https://doi.org/10.1007/s10489-013-0461-5
