Weighted multi-view co-clustering (WMVCC) for sparse data

Hussain, Syed Fawad; Khan, Khadija; Jillani, Rashad

doi:10.1007/s10489-021-02405-3

Published: 01 May 2021

Applied Intelligence, volume 52, pages 398–416 (2022)


Abstract

Multi-view clustering has gained importance in recent times due to the large-scale generation of data, often from multiple sources. Multi-view clustering refers to clustering a set of objects that are described by multiple sets of features, known as views; movies, for example, can be described by their lists of actors or by textual summaries of their plots. Co-clustering, on the other hand, refers to the simultaneous grouping of data samples and features under the assumption that samples exhibit a pattern only under a subset of features. This paper combines multi-view clustering with co-clustering and proposes a new Weighted Multi-View Co-Clustering (WMVCC) algorithm. The motivation behind the approach is to use the diversity of features provided by multiple sources of information while exploiting the power of co-clustering. The proposed method expands the clustering objective function into a unified co-clustering objective function across all views. The algorithm follows the k-means strategy and iteratively optimizes the clustering by updating cluster labels, features, and view weights. A local search is also employed to optimize the clustering result using weighted multi-step paths in a graph. Experiments are conducted on several benchmark datasets. The results show that the proposed approach converges quickly and that its clustering performance significantly exceeds that of other recent and state-of-the-art algorithms on sparse datasets.
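
To make the alternating k-means-style update described in the abstract more concrete, the following is a minimal sketch of a generic weighted multi-view k-means loop. It is not the WMVCC algorithm: the abstract does not give the objective function, so the squared Euclidean distance, the inverse-dispersion view weighting, and the function name weighted_multiview_kmeans are all assumptions made for illustration. Feature co-clustering and the graph-based local search are omitted.

```python
import numpy as np


def weighted_multiview_kmeans(views, k, n_iter=20, seed=0):
    """Illustrative sketch only, not the WMVCC algorithm from the paper.

    views: list of (n_samples, n_features_v) arrays describing the same objects.
    Returns (labels, view_weights).
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    weights = np.full(len(views), 1.0 / len(views))  # start with equal view weights
    # Initialise centroids from k distinct samples, using the same samples in every view.
    init = rng.choice(n, size=k, replace=False)
    centroids = [X[init].astype(float) for X in views]
    labels = np.zeros(n, dtype=int)

    for _ in range(n_iter):
        # Per-view squared Euclidean distances, stacked to shape (n_views, n, k).
        dists = np.stack([
            ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
            for X, C in zip(views, centroids)
        ])
        # Step 1: assign each sample using the weighted sum of distances over views.
        new_labels = np.tensordot(weights, dists, axes=1).argmin(axis=1)

        # Step 2: recompute each cluster centroid in every view.
        for C, X in zip(centroids, views):
            for j in range(k):
                members = X[new_labels == j]
                if len(members):  # keep the old centroid if a cluster empties
                    C[j] = members.mean(axis=0)

        # Step 3 (assumed rule): views with lower within-cluster dispersion
        # receive a larger weight; weights are normalised to sum to 1.
        dispersion = np.array([
            ((X - C[new_labels]) ** 2).sum() for X, C in zip(views, centroids)
        ])
        inv = 1.0 / (dispersion + 1e-12)
        weights = inv / inv.sum()

        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels, weights
```

Calling weighted_multiview_kmeans([view_a, view_b], k=2) on two arrays with one row per object returns one label per object and one weight per view. The dispersion-based weighting assumes the views are on comparable scales; in practice each view would likely need normalising first.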

Data availability

Publicly available data (references mentioned in text).

Notes

Links to datasets used are available here: https://sites.google.com/site/fawadsyed/datasets

Acknowledgements

Khadija Khan would like to thank the Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi, Pakistan for providing her a fully funded scholarship to pursue the MS degree under its GA-1 scheme.

Code availability

The code will be released in the future.

Funding

This work is part of a graduate thesis (Ms. Khadija Khan) funded by the Ghulam Ishaq Khan Institute under its scholarship (GA-1) scheme.

Author information

Authors and Affiliations

Machine Learning and Data Science (MDS) Lab, G.I.K Institute, Topi, KPK, 23460, Pakistan
Syed Fawad Hussain, Khadija Khan & Rashad Jillani
Faculty of Computer Science and Engineering, G.I.K. Institute, Topi, KPK, 23460, Pakistan
Syed Fawad Hussain, Khadija Khan & Rashad Jillani

Contributions

Syed Fawad Hussain conceptualized the idea, edited the manuscript, and wrote portions of the text (the method description and the analysis of results); Khadija Khan wrote the code, including implementations of some of the competing methods, and ran the simulations; Rashad Jillani helped write a substantial part of the manuscript and generated the graphs.

Corresponding author

Correspondence to Syed Fawad Hussain.

Ethics declarations

Conflicts of interest/competing interests

None.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Hussain, S.F., Khan, K. & Jillani, R. Weighted multi-view co-clustering (WMVCC) for sparse data. Appl Intell 52, 398–416 (2022). https://doi.org/10.1007/s10489-021-02405-3

Accepted: 30 March 2021
Published: 01 May 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10489-021-02405-3
