Learning to cluster using local neighborhood structure

Published: 04 July 2004

Abstract

This paper introduces an approach to clustering and classification based on the local, high-order structure present in the data. For some problems, this local structure may be more relevant for classification than the point-similarity measures used by popular unsupervised and semi-supervised clustering methods. Under this approach, changes in the class label are associated with changes in the local properties of the data. Building on this idea, we also aim to learn how to cluster from examples of clustered data, including examples drawn from different datasets. We formalize these concepts with a probability model that captures their fundamentals and show that, in this setting, learning to cluster is a well-defined and tractable task. Using probabilistic inference methods, we then present an algorithm for computing the posterior probability distribution of class labels for each data point. Experiments in spatial grouping and functional gene classification illustrate and test these concepts.
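As a rough illustration of computing a posterior distribution over class labels by probabilistic inference, the sketch below runs sum-product (forward-backward) on a toy chain of points. It is not the paper's model: the chain layout, the function name `label_posteriors`, and the per-edge `p_change` values (an assumed probability that the label changes between neighbouring points, standing in for whatever the local properties of the data would indicate) are all hypothetical choices made for this example.

```python
import numpy as np

def label_posteriors(p_change, observed, n_classes=2):
    """Posterior label marginals on a chain via sum-product.

    p_change[i]: assumed probability that the label changes
                 between point i and point i + 1 (hypothetical input).
    observed:    dict {point index: known label}.
    """
    n = len(p_change) + 1

    # Evidence is uninformative except at points with a known label.
    evidence = np.ones((n, n_classes))
    for i, lab in observed.items():
        evidence[i] = 0.0
        evidence[i, lab] = 1.0

    def transition(p):
        # Keep the label with probability 1 - p; otherwise switch
        # uniformly to one of the other classes.
        t = np.full((n_classes, n_classes), p / (n_classes - 1))
        np.fill_diagonal(t, 1.0 - p)
        return t

    # Forward messages (normalized at each step for stability).
    fwd = np.zeros((n, n_classes))
    fwd[0] = evidence[0] / evidence[0].sum()
    for i in range(1, n):
        fwd[i] = evidence[i] * (fwd[i - 1] @ transition(p_change[i - 1]))
        fwd[i] /= fwd[i].sum()

    # Backward messages.
    bwd = np.ones((n, n_classes))
    for i in range(n - 2, -1, -1):
        bwd[i] = transition(p_change[i]) @ (evidence[i + 1] * bwd[i + 1])
        bwd[i] /= bwd[i].sum()

    # Marginal posterior at each point is proportional to fwd * bwd.
    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)

# Five points, labels known only at the two ends; the local structure
# is assumed to change sharply between points 2 and 3.
posteriors = label_posteriors([0.05, 0.05, 0.9, 0.05], {0: 0, 4: 1})
```

In this toy setup the two labelled end points plus the single high `p_change` edge are enough to pull the unlabelled interior points toward the label on their side of the boundary, which mirrors the idea that label changes coincide with changes in local properties.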

Cited By

  • (2014) Bayesian analysis of similarity matrices for speaker diarization. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 106-110. DOI: 10.1109/ICASSP.2014.6853567. Online publication date: May 2014.
  • (2011) Distributed inferencing with ambient and wearable sensors. Wireless Communications and Mobile Computing, 12:1, pp. 117-131. DOI: 10.1002/wcm.893. Online publication date: 28 Dec 2011.
  • (2004) Proximity graphs for clustering and manifold learning. Proceedings of the 18th International Conference on Neural Information Processing Systems, pp. 225-232. DOI: 10.5555/2976040.2976069. Online publication date: 1 Dec 2004.

Published In

ICML '04: Proceedings of the twenty-first international conference on Machine learning
July 2004
934 pages
ISBN:1581138385
DOI:10.1145/1015330
Conference Chair: Carla Brodley

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
