Abstract
Semi-supervised clustering, also known as clustering with partial supervision or partial background information, learns from a combination of labeled and unlabeled data and has received considerable attention over the last decade. The supervisory information is usually used as constraints that bias clustering toward a good region of the search space. This paper proposes a semi-supervised algorithm, called constrained non-negative matrix factorization (Constrained-NMF), that uses a few labeled examples as constraints to improve performance. Because the proposed algorithm is a matrix factorization algorithm, its matrices must be initialized at the outset. Although the benefits of good initialization are well known, random seeding of the basis matrix and coefficient matrix is still the standard approach for many non-negative matrix factorization (NMF) algorithms. This work devises an entropy-based weighted semi-supervised fuzzy c-means (EWSS-FCM) algorithm to initialize the seeds. EWSS-FCM emphasizes the role of labeled examples and automatically weights them during the course of clustering, and the experimental results indicate that the proposed Constrained-NMF benefits from the initialization it provides. Both algorithms incorporate labeled examples in their objective functions, in which the labeled information is propagated to unlabeled examples iteratively. We further analyze the proposed Constrained-NMF and give convergence justifications. Experiments conducted on five real data sets indicate that the proposed algorithm generally outperforms the other alternatives.
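To make the role of initialization concrete, the following is a minimal sketch of plain, unconstrained NMF using the standard Lee and Seung multiplicative updates. It is not the Constrained-NMF or EWSS-FCM algorithm proposed in the paper. The function names `nmf_multiplicative` and `frobenius_loss`, the synthetic data, and the random seeds `W0` and `H0` are all assumptions made for illustration; in the setting described above, the seeds would come from a better-informed initializer instead of a random draw.

```python
import numpy as np


def nmf_multiplicative(X, W0, H0, n_iter=200, eps=1e-10):
    """Plain NMF via Lee-Seung multiplicative updates (illustrative only).

    Approximately minimizes ||X - W H||_F^2 subject to W, H >= 0,
    starting from the caller-supplied seeds W0 (basis) and H0 (coefficients).
    """
    W, H = W0.copy(), H0.copy()
    for _ in range(n_iter):
        # Update the coefficient matrix, then the basis matrix.
        # eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def frobenius_loss(X, W, H):
    """Squared Frobenius reconstruction error."""
    return float(np.linalg.norm(X - W @ H, "fro") ** 2)


# Hypothetical data: 20 features, 30 samples stored as columns, 3 clusters.
rng = np.random.default_rng(0)
X = rng.random((20, 30))
k = 3

# Random seeding, the common default that the abstract argues can be improved.
W0 = rng.random((20, k)) + 1e-3
H0 = rng.random((k, 30)) + 1e-3

loss_before = frobenius_loss(X, W0, H0)
W, H = nmf_multiplicative(X, W0, H0)
loss_after = frobenius_loss(X, W, H)

# One common way to read clusters off an NMF: assign each sample to the
# basis vector with the largest coefficient in its column of H.
labels = H.argmax(axis=0)
```

Because the updates are multiplicative, non-negative seeds stay non-negative, and the result depends on where the iteration starts. That dependence is why the choice of `W0` and `H0` matters for matrix factorization methods of this kind.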





Acknowledgements
This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grant Nos. MOST 106-2221-E-009-100 and MOST 105-2218-E-009-034.
About this article
Cite this article
Liu, CL., Hsaio, WH., Chang, TH. et al. Clustering data with partial background information. Int. J. Mach. Learn. & Cyber. 10, 1123–1138 (2019). https://doi.org/10.1007/s13042-018-0790-0
DOI: https://doi.org/10.1007/s13042-018-0790-0