Abstract
Semi-supervised clustering algorithms aim to improve clustering results using limited supervision. The supervision is generally given as pairwise constraints; such constraints are natural for graphs, yet most semi-supervised clustering algorithms are designed for data represented as vectors. In this paper, we unify vector-based and graph-based approaches. We first show that a recently proposed objective function for semi-supervised clustering based on Hidden Markov Random Fields, with squared Euclidean distance and a certain class of constraint penalty functions, can be expressed as a special case of the weighted kernel k-means objective (Dhillon et al., in Proceedings of the 10th International Conference on Knowledge Discovery and Data Mining, 2004a). A recent theoretical connection between weighted kernel k-means and several graph clustering objectives enables us to perform semi-supervised clustering of data given either as vectors or as a graph. For graph data, this result leads to algorithms for optimizing several new semi-supervised graph clustering objectives. For vector data, the kernel approach also enables us to find clusters with non-linear boundaries in the input data space. Furthermore, we show that recent work on spectral learning (Kamvar et al., in Proceedings of the 17th International Joint Conference on Artificial Intelligence, 2003) may be viewed as a special case of our formulation. We empirically show that our algorithm outperforms current state-of-the-art semi-supervised algorithms on both vector-based and graph-based data sets.
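As a rough illustration of the weighted kernel k-means objective mentioned above, the following Python sketch runs the standard batch assignment step on a precomputed kernel matrix. It is a minimal sketch under stated assumptions, not the algorithm evaluated in the paper: the function names `weighted_kernel_kmeans` and `add_constraints`, the `penalty` parameter, and the rule of adding a reward to must-link entries and subtracting a penalty from cannot-link entries are illustrative choices that the abstract does not spell out.

```python
import numpy as np

def weighted_kernel_kmeans(K, weights, labels, n_clusters, max_iter=100):
    """Batch weighted kernel k-means on a precomputed kernel matrix K.

    Hypothetical helper for illustration. The squared feature-space distance
    from point i to the weighted mean of cluster c is
      K_ii - 2 * sum_j w_j K_ij / s_c + sum_{j,l} w_j w_l K_jl / s_c**2,
    with j, l ranging over cluster c and s_c the total weight of cluster c.
    """
    K = np.asarray(K, dtype=float)
    w = np.asarray(weights, dtype=float)
    labels = np.asarray(labels).copy()
    n = K.shape[0]
    for _ in range(max_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            s = w[members].sum()
            if s == 0:
                continue  # an empty cluster stays at infinite distance
            wc = w[members]
            second = K[:, members] @ wc / s
            third = wc @ K[np.ix_(members, members)] @ wc / s**2
            dist[:, c] = np.diag(K) - 2 * second + third
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels

def add_constraints(K, must_link, cannot_link, penalty=1.0):
    """Assumed encoding: reward must-link pairs, penalize cannot-link pairs."""
    K = np.array(K, dtype=float)
    for i, j in must_link:
        K[i, j] += penalty
        K[j, i] += penalty
    for i, j in cannot_link:
        K[i, j] -= penalty
        K[j, i] -= penalty
    return K

# Two well-separated groups of points; a linear kernel reduces to plain k-means.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
K = X @ X.T
init = np.array([0, 0, 1, 0, 1, 1])
labels = weighted_kernel_kmeans(K, np.ones(len(X)), init, n_clusters=2)

# Same data with one must-link and one cannot-link pair folded into the kernel.
K_constrained = add_constraints(K, must_link=[(0, 1)], cannot_link=[(0, 3)])
constrained_labels = weighted_kernel_kmeans(
    K_constrained, np.ones(len(X)), init, n_clusters=2
)
```

A modified kernel of this kind need not be positive semidefinite, so the iteration is capped by `max_iter` rather than relying on convergence; adding a positive multiple of the identity to the kernel is one common remedy.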
References
Bansal, N., Blum, A., & Chawla, S. (2002). Correlation clustering. In Proceedings of the 43rd IEEE symposium on foundations of computer science (FOCS-02) (pp. 238–247).
Bar-Hillel, A., Hertz, T., Shental, N., & Weinshall, D. (2003). Learning distance functions using equivalence relations. In Proceedings 20th international conference on machine learning (pp. 11–18).
Basu, S., Banerjee, A., & Mooney, R. J. (2002). Semi-supervised clustering by seeding. In Proceedings of 19th international conference on machine learning (ICML-2002) (pp. 19–26).
Basu, S., Banerjee, A., & Mooney, R. J. (2004a). Active semi-supervision for pairwise constrained clustering. In Proceedings 4th SIAM international conference on data mining.
Basu, S., Bilenko, M., & Mooney, R. J. (2004b). A probabilistic framework for semi-supervised clustering. In Proceedings of 10th ACM SIGKDD international conference on knowledge discovery and data mining (KDD-2004) (pp. 59–68).
Bie, T. D., Momma, M., & Cristianini, N. (2003). Efficiently learning the metric using side-information. In Lecture notes in artificial intelligence: Vol. 2842. Proceedings of the 14th international conference on algorithmic learning theory (ALT2003) (pp. 175–189). Berlin: Springer.
Bilenko, M., & Basu, S. (2004). A comparison of inference techniques for semi-supervised clustering with hidden Markov random fields. In Proceedings of the ICML-2004 workshop on statistical relational learning and its connections to other fields (SRL-2004), Banff, Canada.
Bilenko, M., Basu, S., & Mooney, R. (2004). Integrating constraints and metric learning in semi-supervised clustering. In Proceedings of the 21st international conference on machine learning.
Chan, P., Schlag, M., & Zien, J. (1994). Spectral k-way ratio cut partitioning. IEEE Transactions CAD-Integrated Circuits and Systems, 13, 1088–1096.
Chang, H., & Yeung, D. (2004). Locally linear metric adaptation for semi-supervised clustering. In Proceedings of the twenty-first international conference on machine learning (ICML) (pp. 153–160).
Chapelle, O., Schölkopf, B., & Zien, A. (Eds.) (2006). Semi-supervised learning. Cambridge: MIT Press.
Charikar, M., Guruswami, V., & Wirth, A. (2003). Clustering with qualitative information. In Proceedings of the 44th annual IEEE symposium on foundations of computer science (pp. 524–533).
Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley-Interscience.
Cristianini, N., & Shawe-Taylor, J. (2000). Introduction to support vector machines. Cambridge: Cambridge University Press.
Davidson, I., & Ravi, S. S. (2005a). Clustering with constraints: feasibility issues and the k-means algorithm. In Proceedings of the 2005 SIAM international conference on data mining.
Davidson, I., & Ravi, S. S. (2005b). Hierarchical clustering with constraints: theory and practice. In Proceedings of the ninth European principles and practice of KDD (PKDD) (pp. 59–70).
Demaine, E. D., & Immorlica, N. (2003). Correlation clustering with partial information. In Proceedings of the 6th international workshop on approximation algorithms for combinatorial optimization problems and 7th international workshop on randomization and approximation techniques in computer science (RANDOM-APPROX).
Demiriz, A., Bennett, K. P., & Embrechts, M. J. (1999). Semi-supervised clustering using genetic algorithms. In Artificial neural networks in engineering (ANNIE-99) (pp. 809–814).
Dhillon, I., Guan, Y., & Kulis, B. (2004a). Kernel k-means, spectral clustering and normalized cuts. In Proceedings of the 10th international conference on knowledge discovery and data mining (pp. 551–556).
Dhillon, I., Guan, Y., & Kulis, B. (2004b). A unified view of kernel k-means, spectral clustering and graph cuts (Tech. rep. TR-04-25). University of Texas at Austin.
Dhillon, I., Guan, Y., & Kulis, B. (2007). Weighted graph cuts without eigenvectors: a multilevel approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(11), 1944–1957.
Duda, R. O., & Hart, P. E. (1973). Pattern classification and scene analysis. New York: Wiley.
Grauman, K., & Darrell, T. (2005). The pyramid match kernel: discriminative classification with sets of image features. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1458–1465).
Kamvar, S. D., Klein, D., & Manning, C. (2003). Spectral learning. In Proceedings of the 17th international joint conference on artificial intelligence (pp. 561–566).
Klein, D., Kamvar, D., & Manning, C. (2002). From instance-level constraints to space-level constraints: making the most of prior knowledge in data clustering. In Proceedings of the 19th international conference on machine learning (pp. 307–314).
Kleinberg, J., & Tardos, E. (1999). Approximation algorithms for classification problems with pairwise relationships: metric labeling and Markov random fields. In Proceedings of the 40th IEEE symposium on foundations of computer science (FOCS-99) (pp. 14–23).
Kulis, B., Basu, S., Dhillon, I., & Mooney, R. (2005). Semi-supervised graph clustering: a kernel approach. In Proceedings of the 22nd international conference on machine learning (pp. 457–464).
Kulis, B., Surendran, A., & Platt, J. (2007). Fast low-rank semidefinite programming for embedding and clustering. In Proceedings 11th international conference on AI and statistics (AISTATS).
Lange, T., Law, M. H. C., Jain, A. K., & Buhmann, J. M. (2005). Learning with constrained and unlabelled data. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
Law, M. H. C., Topchy, A., & Jain, A. K. (2005). Model-based clustering with probabilistic constraints. In Proceedings of the 2005 SIAM international conference on data mining (pp. 641–645).
Lee, I., Date, S. V., Adai, A. T., & Marcotte, E. M. (2004). A probabilistic functional network of yeast genes. Science, 306(5701), 1555–1558.
Lu, Z., & Leen, T. (2005). Semi-supervised learning with penalized probabilistic clustering. In Advances in neural information processing systems.
Meila, M., & Shi, J. (2001). A random walks view of spectral segmentation. In Proceedings of the 8th international workshop on artificial intelligence and statistics (AISTATS).
Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H., & Kanehisa, M. (1999). KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Research, 27, 29–34.
Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 888–905.
Sinkkonen, J., & Kaski, S. (2002). Clustering based on conditional distributions in an auxiliary space. Neural Computation, 14(1), 217–239.
Smola, A. J., & Kondor, R. (2003). Kernels and regularization on computational graphs. In Proceedings conference on computational learning theory (COLT) (pp. 144–158).
Strehl, A., Ghosh, J., & Mooney, R. (2000). Impact of similarity measures on web-page clustering. In Workshop on artificial intelligence for web search (AAAI).
Wagstaff, K., Cardie, C., Rogers, S., & Schroedl, S. (2001). Constrained k-means clustering with background knowledge. In Proceedings of the 18th international conference on machine learning (pp. 577–584).
Xing, E. P., Ng, A. Y., Jordan, M. I., & Russell, S. (2003). Distance metric learning, with application to clustering with side-information. In Advances in neural information processing systems 15.
Yan, R., Zhang, J., Yang, J., & Hauptmann, A. G. (2004). A discriminative learning framework with pairwise constraints for video object classification. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition (CVPR) (Vol. 2, pp. 284–291).
Yu, S., & Shi, J. (2003). Multiclass spectral clustering. In International conference on computer vision (pp. 313–319).
Yu, S., & Shi, J. (2004). Segmentation given partial grouping constraints. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 173–183.
Additional information
Editor: Jennifer Dy.
About this article
Cite this article
Kulis, B., Basu, S., Dhillon, I. et al. Semi-supervised graph clustering: a kernel approach. Mach Learn 74, 1–22 (2009). https://doi.org/10.1007/s10994-008-5084-4
DOI: https://doi.org/10.1007/s10994-008-5084-4