Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering

Neural Processing Letters

Abstract

Clustering is an essential data analysis technique that has been studied extensively over the last decades. Previous studies have shown that data representation and data structure information are two critical factors for improving clustering performance, and these form two important lines of research. The first line attempts to learn representative features, especially with deep neural networks, for handling clustering problems. The second concerns exploiting the geometric structure information within data for clustering. Although both have achieved promising performance in many clustering tasks, few efforts have been dedicated to combining them in a unified deep clustering framework, which is the research gap we aim to bridge in this work. In this paper, we propose a novel approach, Manifold regularized Deep Embedded Clustering (MDEC), to address this gap. It simultaneously models the data generating distribution, cluster assignment consistency, and the geometric structure of data in a unified framework. The proposed method can be optimized with mini-batch stochastic gradient descent and back-propagation. We evaluate MDEC on three real-world datasets (USPS, REUTERS-10K, and MNIST); the experimental results demonstrate that our model outperforms baseline models and achieves state-of-the-art performance.
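The abstract does not state the objective function, so the following is only an illustrative sketch of how the three ingredients it names might be combined into one loss. It assumes a reconstruction error for the data generating distribution, a KL divergence between Student-t soft assignments and a sharpened target distribution (as commonly used in deep embedded clustering) for cluster assignment consistency, and a graph smoothness penalty for the geometric structure. The function names, the affinity matrix `W`, and the weights `gamma` and `lam` are hypothetical and are not taken from the paper.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def soft_assign(Z, centers):
    """Student-t soft assignment q_ij of embedding z_i to center mu_j (assumed form)."""
    Q = []
    for z in Z:
        row = [1.0 / (1.0 + sq_dist(z, mu)) for mu in centers]
        total = sum(row)
        Q.append([v / total for v in row])
    return Q

def target_dist(Q):
    """Sharpened target p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    k = len(Q[0])
    f = [sum(q[j] for q in Q) for j in range(k)]
    P = []
    for q in Q:
        row = [q[j] ** 2 / f[j] for j in range(k)]
        total = sum(row)
        P.append([v / total for v in row])
    return P

def kl_div(P, Q):
    """KL(P || Q) summed over all samples and clusters."""
    return sum(p * math.log(p / q)
               for p_row, q_row in zip(P, Q)
               for p, q in zip(p_row, q_row) if p > 0)

def recon_loss(X, X_hat):
    """Mean squared reconstruction error per sample."""
    return sum(sq_dist(x, xh) for x, xh in zip(X, X_hat)) / len(X)

def manifold_penalty(Z, W):
    """0.5 * sum_ij W_ij * ||z_i - z_j||^2: neighbors in the graph stay close."""
    n = len(Z)
    return 0.5 * sum(W[i][j] * sq_dist(Z[i], Z[j])
                     for i in range(n) for j in range(n))

def combined_loss(X, X_hat, Z, centers, W, gamma=0.1, lam=0.01):
    """Hypothetical weighted sum of the three terms; gamma and lam are assumptions."""
    Q = soft_assign(Z, centers)
    P = target_dist(Q)
    return recon_loss(X, X_hat) + gamma * kl_div(P, Q) + lam * manifold_penalty(Z, W)

# Toy usage: three 2-D embeddings, two centers, one graph edge between points 0 and 1.
X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
Z = X                      # pretend the encoder is the identity
X_hat = X                  # pretend reconstruction is perfect
centers = [[0.0, 0.5], [5.0, 5.0]]
W = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
loss = combined_loss(X, X_hat, Z, centers, W)
```

In a real model the embeddings and reconstructions would come from an autoencoder and the loss would be minimized over mini-batches with a gradient-based optimizer; the weights would need tuning, for example by grid search.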

Notes

  1. Optimal values of parameters can be found using grid search.

Acknowledgements

The work was partially supported by the National Natural Science Foundation of China (No. 61722211), the Federal Ministry of Education and Research (No. 01LE1806A), the Natural Science Foundation of Chongqing (No. cstc2017jcyjBX0059), and the Beijing Academy of Artificial Intelligence (No. BAAI2019ZD0306).

Author information

Corresponding author

Correspondence to Xiaofei Zhu.

Cite this article

Zhu, X., Do, K.D., Guo, J. et al. Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering. Neural Process Lett 53, 1–16 (2021). https://doi.org/10.1007/s11063-020-10375-9

  • DOI: https://doi.org/10.1007/s11063-020-10375-9
