Global and local clustering with kNN and local PCA

Multimedia Tools and Applications

Abstract

This paper proposes a new clustering method that combines the k-nearest neighbor (kNN) method with local Principal Component Analysis (PCA) so that clustering takes both the global and the local information of the data points into account. Specifically, we first preserve the local information of the samples by using kNN to obtain a neighborhood subset and a covariance matrix for each data point. We then preserve the global information of the data by conducting local PCA on each covariance matrix to obtain a binary affinity matrix. Finally, our method clusters the resulting affinity matrix without requiring the number of clusters to be specified in advance. Experiments on 8 UCI benchmark datasets showed that the proposed method outperformed state-of-the-art clustering methods in terms of clustering performance.
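The pipeline summarized in the abstract (kNN neighborhoods, a covariance matrix per point, local PCA, a binary affinity matrix, clustering without a preset number of clusters) can be illustrated with a rough sketch. The abstract does not say how the binary affinity is derived from the local PCA results or how the final clustering is performed, so everything specific below is an assumption rather than the authors' method: the residual-ratio rule with threshold `tol`, the mutual-agreement symmetrization, the use of connected components as the clustering step, and the function names `knn_indices`, `local_pca_affinity`, and `connected_components` are all hypothetical.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (self excluded)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def local_pca_affinity(X, k=3, n_components=1, tol=0.2):
    """Binary affinity from kNN neighborhoods and local PCA.

    ASSUMED rule (not taken from the paper): neighbor j is linked to point i
    when the offset x_j - x_i lies mostly inside the local principal subspace
    at i, i.e. its off-subspace residual is at most `tol` of its length.
    """
    n = X.shape[0]
    nbrs = knn_indices(X, k)
    A = np.zeros((n, n), dtype=bool)
    for i in range(n):
        patch = X[np.append(nbrs[i], i)]        # neighborhood subset of point i
        cov = np.cov(patch, rowvar=False)       # local covariance matrix
        _, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
        basis = eigvecs[:, -n_components:]      # leading local directions
        diffs = X[nbrs[i]] - X[i]
        resid = diffs - diffs @ basis @ basis.T
        ratio = np.linalg.norm(resid, axis=1) / (np.linalg.norm(diffs, axis=1) + 1e-12)
        A[i, nbrs[i][ratio <= tol]] = True
    return A & A.T                              # ASSUMED: keep mutual links only


def connected_components(A):
    """Label the connected components of a symmetric boolean affinity matrix."""
    n = A.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in np.flatnonzero(A[u]):
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


# Toy data: a horizontal segment and a distant vertical segment.
X = np.array([[float(t), 0.0] for t in range(10)]
             + [[100.0, float(t)] for t in range(10)])
labels = connected_components(local_pca_affinity(X, k=3))
print(len(set(labels.tolist())), "clusters found")
```

In this sketch the number of clusters is simply the number of connected components of the affinity graph, which is one plausible way to avoid fixing it in advance; the paper may well use a different procedure.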

Acknowledgments

This work is partially supported by the China Key Research Program (Grant No: 2016YFB1000905); the Natural Science Foundation of China (Grants No: 61573270 and 61672177); the Project of Guangxi Science and Technology (GuiKeAD17195062); the Guangxi Natural Science Foundation (Grant No: 2015GXNSFCB139011); the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing; the Guangxi High Institutions Program of Introducing 100 High-Level Overseas Talents; and the Research Fund of Guangxi Key Lab of Multisource Information Mining & Security (18-A-01-01).

Author information

Corresponding author

Correspondence to Xiaofeng Zhu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, L., Zhu, X. & Tong, T. Global and local clustering with kNN and local PCA. Multimed Tools Appl 77, 29727–29738 (2018). https://doi.org/10.1007/s11042-018-6488-1

  • DOI: https://doi.org/10.1007/s11042-018-6488-1
