Structure preserving dimensionality reduction for visual object recognition

Multimedia Tools and Applications

Abstract

Robust object recognition has drawn increasing attention in computer vision and machine learning, driven by the fast development of feature extraction and classification techniques and by the release of public datasets such as the Caltech datasets, Pascal Visual Object Classes, and ImageNet. Recently, deep learning based object recognition systems have shown significant performance improvements in visual object recognition tasks by using innovative learning methodologies. However, searching and recognition in a high dimensional space are time consuming, so this paper reconsiders how point and range queries are performed in high dimensions for object recognition. It proposes optimized dimensionality reduction using structured sparse principal component analysis. The proposed method retains the structure of the high dimensional features, removes redundant features that do not contribute to similarity, and classifies the query image against a large database. Qualitative and quantitative experimental results, including a comparison with current state-of-the-art visual object recognition algorithms, verify that the proposed recognition algorithm performs favorably while reducing both the query image dimension and the number of training images.
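
The general idea in the abstract (learn sparse loadings, project features to a few dimensions, then match a query in the reduced space) can be illustrated with a small sketch. This is not the authors' algorithm, whose details are not given on this page: it uses plain sparse PCA computed by a soft-thresholded power iteration, with no structured penalty, followed by a 1-nearest-neighbor lookup. The function names, the penalty value `lam`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, lam):
    """Shrink each entry of v toward zero by lam (the L1 proximal operator)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def sparse_pca(X, n_components, lam, n_iter=100):
    """Hypothetical helper: sparse loadings via thresholded power iteration."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    d = cov.shape[0]
    W = np.zeros((d, n_components))
    for k in range(n_components):
        # Start from the leading eigenvector (eigh sorts eigenvalues ascending).
        w = np.linalg.eigh(cov)[1][:, -1]
        for _ in range(n_iter):
            w_new = soft_threshold(cov @ w, lam)
            norm = np.linalg.norm(w_new)
            if norm == 0.0:  # the penalty removed every feature
                w = np.zeros(d)
                break
            w = w_new / norm
        W[:, k] = w
        # Projection deflation: remove the direction just found.
        P = np.eye(d) - np.outer(w, w)
        cov = P @ cov @ P
    return mean, W


def nearest_neighbor_label(Z_train, y_train, z_query):
    """Label of the closest training point in the reduced space."""
    dists = np.linalg.norm(Z_train - z_query, axis=1)
    return y_train[np.argmin(dists)]


# Synthetic data (an assumption): 20 features, only the first 3 separate
# the two classes, features 3-5 share a nuisance factor, the rest are noise.
rng = np.random.default_rng(0)
n, d = 200, 20
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, d))
X[:, 0:3] += np.where(y[:, None] == 1, 3.0, -3.0)
X[:, 3:6] += 2.0 * rng.normal(size=(n, 1))

mean, W = sparse_pca(X, n_components=2, lam=3.0)
Z = (X - mean) @ W  # training set reduced from 20 to 2 dimensions

# Two noise-free query points, one at each class center.
queries = np.zeros((2, d))
queries[0, 0:3] = -3.0
queries[1, 0:3] = 3.0
predicted = [nearest_neighbor_label(Z, y, (q - mean) @ W) for q in queries]

print("nonzero loadings per component:", np.count_nonzero(W, axis=0))
print("predicted labels:", [int(p) for p in predicted])
```

In this sketch the L1 threshold zeroes the loadings of features that carry only noise, so the nearest-neighbor search runs in two dimensions instead of twenty. A structured variant, as named in the abstract, would replace the elementwise threshold with a penalty that selects groups of related features together.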

Acknowledgements

J. Song and S.M. Yoon were supported by National Research Foundation of Korea grants (No. 2015R1A5A7037615, No. 2016R1D1A1B04932889) and IITP (#2014-0-00501), funded by the Korean Government. H. Cho was supported by the National Research Foundation of Korea (No. 2017R1A2B4011015). G.J. Yoon was supported by the National Institute for Mathematical Sciences (NIMS).

Author information

Corresponding author

Correspondence to Sang Min Yoon.

Cite this article

Song, J., Yoon, G., Cho, H. et al. Structure preserving dimensionality reduction for visual object recognition. Multimed Tools Appl 77, 23529–23545 (2018). https://doi.org/10.1007/s11042-018-5682-5

  • DOI: https://doi.org/10.1007/s11042-018-5682-5