Advanced pattern recognition from complex environments: a classification-based approach

  • Methodologies and Application
Soft Computing

Abstract

This paper describes an algorithm for building 3D maps of objects detected in visual scenes acquired in an indoor environment. One feature of the algorithm is that it works with a standard webcam equipped with a simple device that automatically estimates the camera orientation and its distance from the floor. Another is its low computational complexity. The algorithm first extracts from the acquired images the regions of interest (ROIs) that may contain an object. The 3D position of each ROI is then estimated, and a map of the environment is generated. ROI extraction is performed with a Haar-like approach. ROIs are represented with edge-based features, and the edge representation is filtered with a novel fuzzy-based technique that removes edges introduced by noise. Object classification is performed with a pseudo-2D HMM algorithm. We demonstrate the reliability of our method by discussing some critical applications in the context of human–robot interaction and robot–robot interaction. Finally, we complete our contributions by describing a case study in the robotic field and providing comprehensive experimental results that show the benefits of our approach.
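
The abstract does not give the geometry used to turn a ROI into a 3D position, so the following is only an illustrative sketch of one common way to do it, not the authors' method. It assumes a pinhole camera with no roll, a flat floor, and a ROI whose bottom edge touches the floor. The function name `floor_point_from_pixel` and all of its parameters (`focal_px`, `cx`, `cy`, `cam_height_m`, `pitch_rad`) are hypothetical. Given the camera height and its downward pitch, the ray through a pixel is intersected with the floor plane.

```python
import math
from typing import Optional, Tuple


def floor_point_from_pixel(
    u: float,
    v: float,
    focal_px: float,
    cx: float,
    cy: float,
    cam_height_m: float,
    pitch_rad: float,
) -> Optional[Tuple[float, float]]:
    """Intersect the viewing ray through pixel (u, v) with the floor plane.

    Assumptions (not taken from the paper): pinhole model, zero roll,
    flat floor, image v axis pointing down, pitch_rad > 0 means the
    camera is tilted downward.

    Returns (lateral_m, forward_m) relative to the point on the floor
    directly below the camera, or None if the ray never reaches the floor.
    """
    # Normalised ray direction in camera coordinates (x right, y down, z forward).
    dx = (u - cx) / focal_px
    dy = (v - cy) / focal_px

    sin_p, cos_p = math.sin(pitch_rad), math.cos(pitch_rad)

    # Rotate the ray about the camera x axis into a frame whose
    # "down" axis is vertical and whose "forward" axis is horizontal.
    down = dy * cos_p + sin_p
    forward = cos_p - dy * sin_p

    # A ray at or above the horizon does not hit the floor.
    if down <= 1e-9:
        return None

    # Scale the ray so that its vertical drop equals the camera height.
    t = cam_height_m / down
    return (t * dx, t * forward)


if __name__ == "__main__":
    # Hypothetical numbers: 640x480 image, 500 px focal length,
    # camera 1 m above the floor, tilted 45 degrees downward.
    print(floor_point_from_pixel(320, 240, 500.0, 320.0, 240.0, 1.0, math.pi / 4))
```

Under these assumptions, a pixel at the image centre of a camera tilted 45 degrees downward maps to a floor point as far ahead as the camera is high, which gives a quick sanity check for the device's orientation and height estimates.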

Author information

Corresponding author

Correspondence to Alfredo Cuzzocrea.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Cuzzocrea, A., Mumolo, E. & Grasso, G.M. Advanced pattern recognition from complex environments: a classification-based approach. Soft Comput 22, 4763–4778 (2018). https://doi.org/10.1007/s00500-017-2661-0

  • DOI: https://doi.org/10.1007/s00500-017-2661-0
