
An Overview of Multi-View Methods for Text Clustering

Machine Learning and Data Analytics for Solving Business Problems

Abstract

Text clustering, which partitions a collection of documents into groups according to a similarity or dissimilarity criterion, has become an important challenge in text mining and machine learning. With advances in information acquisition technologies, textual data can frequently be represented using different techniques, which yields multi-view data. This chapter presents an overview of existing clustering methods, with a special emphasis on multi-view text clustering methods. We design a new categorization model based on the main properties of multi-view text clustering methods. To evaluate the performance of these methods, we perform extensive experiments on several real-world textual data sets. Based on the experimental results, we provide some insights for researchers who need to choose a method for clustering multi-view textual data.
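To make the idea of multi-view text data concrete, here is a minimal sketch in Python. It is an illustration only and does not reproduce any method evaluated in the chapter. It assumes two hypothetical views of each document (word counts and character trigram counts), fuses them by averaging the per-view cosine similarities, and groups documents with single-link agglomerative clustering. All function names and the toy documents are made up for this example.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def word_view(doc):
    """View 1 (assumed for illustration): bag-of-words term counts."""
    return Counter(doc.lower().split())


def char_view(doc, n=3):
    """View 2 (assumed for illustration): character n-gram counts."""
    text = doc.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fused_similarity(docs):
    """Average the per-view cosine similarities for every document pair."""
    views = [[word_view(d) for d in docs], [char_view(d) for d in docs]]
    sim = {}
    for i, j in combinations(range(len(docs)), 2):
        sim[(i, j)] = sum(cosine(v[i], v[j]) for v in views) / len(views)
    return sim


def cluster(docs, k):
    """Single-link agglomerative clustering on the fused similarity."""
    sim = fused_similarity(docs)
    clusters = [{i} for i in range(len(docs))]
    while len(clusters) > k:
        # Pick the two clusters whose closest members are most similar.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: max(
                sim[(min(i, j), max(i, j))]
                for i in clusters[p[0]]
                for j in clusters[p[1]]
            ),
        )
        clusters[a] |= clusters[b]  # a < b, so deleting b keeps a valid
        del clusters[b]
    return clusters


# Toy corpus: two finance sentences and two sports sentences.
docs = [
    "stock markets fell sharply today",
    "stock markets rose sharply today",
    "the team won the football match",
    "the team lost the football match",
]
clusters = cluster(docs, k=2)
print(clusters)
```

On this toy corpus the two finance sentences and the two sports sentences should fall into separate groups. Averaging similarities is only one simple way to combine views; weighting the views or clustering each view separately and merging the partitions are equally plausible choices.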




Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Fraj, M., HajKacem, M.A.B., Essoussi, N. (2022). An Overview of Multi-View Methods for Text Clustering. In: Alyoubi, B., Ben Ncir, CE., Alharbi, I., Jarboui, A. (eds) Machine Learning and Data Analytics for Solving Business Problems. Unsupervised and Semi-Supervised Learning. Springer, Cham. https://doi.org/10.1007/978-3-031-18483-3_8
