
Cauchy Estimator Discriminant Learning for RGB-D Sensor-based Scene Classification

Multimedia Tools and Applications

Abstract

Because depth information has proven effective for scene classification, RGB-D sensor-based scene classification has received wide attention. However, when images are corrupted by noise during transmission, the recognition rate declines significantly. Furthermore, after feature representation schemes are applied, the concatenated features extracted from each RGB and depth image pair are very high-dimensional. This paper therefore presents a new dimensionality reduction algorithm called Cauchy estimator discriminant learning (CEDL). CEDL addresses two goals simultaneously: (1) to reduce, to some extent, the negative influence of noise in the input samples; (2) to preserve the local and global geometric structure of the input samples. Experiments on the widely used NYU Depth V1 dataset suggest that CEDL is effective compared with other state-of-the-art scene classification methods.
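To illustrate why a Cauchy estimator can dampen the effect of noisy samples, the following minimal sketch compares the standard Cauchy (Lorentzian) M-estimator loss with the squared loss. It is not the CEDL objective, which this abstract does not state; the function names and the scale parameter `c` are assumptions chosen for illustration.

```python
import numpy as np

def squared_loss(r):
    """Ordinary least-squares loss: grows quadratically with the residual."""
    return 0.5 * r ** 2

def cauchy_loss(r, c=1.0):
    """Standard Cauchy (Lorentzian) M-estimator loss with scale c (assumed).

    Grows only logarithmically, so large residuals contribute far less
    than they would under the squared loss.
    """
    return 0.5 * c ** 2 * np.log1p((r / c) ** 2)

def cauchy_weight(r, c=1.0):
    """Weight a residual would receive in iteratively reweighted least squares.

    Equals 1 for a zero residual and decays toward 0 as |r| grows.
    """
    return 1.0 / (1.0 + (r / c) ** 2)

if __name__ == "__main__":
    # Hypothetical residuals: the last two stand in for noisy samples.
    for r in np.array([0.1, 1.0, 10.0, 100.0]):
        print(f"r={r:7.1f}  squared={squared_loss(r):10.3f}  "
              f"cauchy={cauchy_loss(r):7.3f}  weight={cauchy_weight(r):.5f}")
```

For small residuals the two losses nearly coincide, while for large residuals the Cauchy loss and its weight shrink the sample's influence, which is the general property a robust estimator of this kind relies on.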

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 61572486, 61402458, 61301242, 61271407 and 61263048, the Guangdong Natural Science Funds under Grant 2014A030310252, the Shenzhen Technology Project under Grant JCYJ20140901003939001, the Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering Program under Grant 2014KLA01, the Young and Middle-Aged Backbone Teachers’ Cultivation Plan of Yunnan University under Grant XT412003, and the Fundamental Research Funds for the Central Universities, China University of Petroleum (East China) under Grant 14CX02203A.

Author information

Corresponding author

Correspondence to Weifeng Liu.

About this article

Cite this article

Tao, D., Yang, X., Liu, W. et al. Cauchy Estimator Discriminant Learning for RGB-D Sensor-based Scene Classification. Multimed Tools Appl 76, 4471–4489 (2017). https://doi.org/10.1007/s11042-016-3370-x

  • DOI: https://doi.org/10.1007/s11042-016-3370-x
