Abstract
Text clustering, which partitions a collection of documents into groups according to a similarity or dissimilarity criterion, has become an important challenge in text mining and machine learning. With advances in information acquisition technologies, textual data can often be represented using several different techniques, which yields multi-view data. In this chapter, we present an overview of existing clustering methods, with special emphasis on multi-view text clustering methods. We design a new categorization based on the main properties identified in multi-view text clustering methods. To evaluate the performance of these methods, we perform extensive experiments on several real-world textual data sets. Based on the experimental results, we provide insights for researchers who need to choose the most suitable method for clustering multi-view textual data.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Fraj, M., HajKacem, M.A.B., Essoussi, N. (2022). An Overview of Multi-View Methods for Text Clustering. In: Alyoubi, B., Ben Ncir, CE., Alharbi, I., Jarboui, A. (eds) Machine Learning and Data Analytics for Solving Business Problems. Unsupervised and Semi-Supervised Learning. Springer, Cham. https://doi.org/10.1007/978-3-031-18483-3_8
DOI: https://doi.org/10.1007/978-3-031-18483-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18482-6
Online ISBN: 978-3-031-18483-3
eBook Packages: Mathematics and Statistics (R0)