DOI: 10.1145/1273496.1273568

Relational clustering by symmetric convex coding

Published: 20 June 2007

ABSTRACT

Relational data appear frequently in machine learning applications. Such data consist of pairwise relations (similarities or dissimilarities) between implicit objects; they are usually stored in relation matrices, and typically no other knowledge about the objects is available. Although relational clustering can be formulated as graph partitioning in some applications, this formulation is not adequate for general relational data. In this paper, we propose a general model for relational clustering based on symmetric convex coding. The model is applicable to all types of relational data and unifies the existing graph partitioning formulation. Under this model, we derive two alternative bound optimization algorithms that solve the symmetric convex coding problem under two popular distance functions: Euclidean distance and generalized I-divergence. Experimental evaluation and theoretical analysis show the effectiveness and potential of the proposed model and algorithms.
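
To make the two distance functions concrete, here is a minimal Python sketch, not the paper's algorithms. It only evaluates the two objectives for a given approximation of a relation matrix. The abstract does not state the form of the approximation; the sketch assumes A ≈ C B Cᵀ, where C holds nonnegative cluster memberships whose rows sum to 1 and B holds cluster-to-cluster relations. The names `A`, `B`, `C`, `C_soft`, and both function names are hypothetical.

```python
import numpy as np


def euclidean_distance(a, b):
    """Squared Frobenius distance between two matrices."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))


def generalized_i_divergence(a, b, eps=1e-12):
    """Generalized I-divergence: sum(a * log(a / b) - a + b).

    Entries with a == 0 contribute nothing to the log term (0 * log 0 = 0).
    eps guards against division by zero in b.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mask = a > 0
    log_term = np.sum(a[mask] * np.log(a[mask] / (b[mask] + eps)))
    return float(log_term - a.sum() + b.sum())


# Toy relation matrix built from an assumed model A = C B C^T:
# four objects, two clusters, hard memberships (each row of C sums to 1).
C = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])
B = np.array([[0.9, 0.1],
              [0.1, 0.8]])
A = C @ B @ C.T

# The exact factors reproduce A, so both distances are (numerically) zero.
exact_euclid = euclidean_distance(A, C @ B @ C.T)
exact_idiv = generalized_i_divergence(A, C @ B @ C.T)

# Soft memberships (rows still sum to 1) give an imperfect approximation,
# so both distances become strictly positive.
C_soft = np.array([[0.8, 0.2],
                   [0.7, 0.3],
                   [0.2, 0.8],
                   [0.1, 0.9]])
soft_euclid = euclidean_distance(A, C_soft @ B @ C_soft.T)
soft_idiv = generalized_i_divergence(A, C_soft @ B @ C_soft.T)

print(exact_euclid, exact_idiv, soft_euclid, soft_idiv)
```

A clustering procedure would search over C and B to minimize one of these two quantities; the bound optimization updates that the paper derives for that search are not reproduced here.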

Published in

ICML '07: Proceedings of the 24th international conference on Machine learning
June 2007
1233 pages
ISBN: 9781595937933
DOI: 10.1145/1273496

Copyright © 2007 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

Article

Acceptance Rates

Overall Acceptance Rate: 140 of 548 submissions, 26%
