Exploring Latent Sparse Graph for Large-Scale Semi-supervised Learning

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13716)

Abstract

We develop a novel scalable graph-based semi-supervised learning (SSL) method for input data consisting of a small amount of labeled data and a large amount of unlabeled data. Because labeled data are scarce while unlabeled data are abundant, existing SSL methods usually either suffer suboptimal performance due to an improper graph constructed from the input data, or are impractical due to the high computational complexity of solving large-scale optimization problems. In this paper, we propose to address both problems by constructing a novel graph of the input data for graph-based SSL methods. A density-based approach is proposed to learn a latent graph from the input data. Based on the latent graph, a novel graph construction approach builds the graph of the input data via an efficient formula. With this formula, two transductive graph-based SSL methods are devised whose computational complexity is linear in the number of input data points. Extensive experiments on synthetic data and real datasets demonstrate that the proposed methods are not only scalable to large-scale data but also achieve good classification performance, especially when the number of labeled data points is extremely small.
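The paper's latent-sparse-graph construction is not reproduced on this page, but the family of methods it improves on can be illustrated with a minimal sketch. Below is a generic transductive label-propagation baseline in the style of Zhou et al.'s local-and-global-consistency method (not the authors' algorithm): build a kNN affinity graph, symmetrically normalize it, and solve a linear system to spread the few known labels to all points. The function names and parameters (`knn_affinity`, `label_propagation`, `k`, `alpha`, `sigma`) are illustrative choices, not from the paper.

```python
import numpy as np

def knn_affinity(X, k=5, sigma=1.0):
    """Symmetric kNN affinity matrix with a Gaussian kernel."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:k + 1]          # k nearest neighbors, skipping self
        W[i, idx] = np.exp(-d2[i, idx] / (2 * sigma ** 2))
    return np.maximum(W, W.T)                      # symmetrize

def label_propagation(X, y, labeled_mask, k=5, alpha=0.9):
    """Closed-form propagation: F = (I - alpha * S)^{-1} Y, S = D^{-1/2} W D^{-1/2}."""
    W = knn_affinity(X, k)
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d) + 1e-12)        # normalized affinity
    classes = np.unique(y[labeled_mask])
    Y = np.zeros((X.shape[0], classes.size))       # one-hot seeds for labeled points
    for j, c in enumerate(classes):
        Y[labeled_mask & (y == c), j] = 1.0
    F = np.linalg.solve(np.eye(X.shape[0]) - alpha * S, Y)
    return classes[F.argmax(axis=1)]

# Toy demo: two well-separated Gaussian blobs, one labeled point per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
mask = np.zeros(60, dtype=bool)
mask[0] = mask[30] = True                          # only 2 of 60 points labeled
pred = label_propagation(X, y, mask)
```

Note the contrast with the abstract's contribution: the dense solve above costs O(n^3), whereas the paper's formula-based construction yields methods linear in the number of data points.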



Acknowledgements

L. Wang was supported in part by NSF DMS-2009689. R. Chan was supported in part by HKRGC GRF Grants CUHK14301718, CityU11301120, and CRF Grant C1013-21GF. T. Zeng was supported in part by the National Key R&D Program of China under Grant 2021YFE0203700.

Author information

Corresponding author

Correspondence to Li Wang.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 240 KB)

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, Z., Wang, L., Chan, R., Zeng, T. (2023). Exploring Latent Sparse Graph for Large-Scale Semi-supervised Learning. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13716. Springer, Cham. https://doi.org/10.1007/978-3-031-26412-2_23

  • DOI: https://doi.org/10.1007/978-3-031-26412-2_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26411-5

  • Online ISBN: 978-3-031-26412-2
