DOI: 10.1145/2623330.2630808
tutorial

Correlation clustering: from theory to practice

Published: 24 August 2014

Abstract

Correlation clustering is arguably the most natural formulation of clustering. Given a set of objects and a pairwise similarity measure between them, the goal is to cluster the objects so that, to the best possible extent, similar objects are put in the same cluster and dissimilar objects are put in different clusters. Because it needs only a definition of similarity, it applies to a wide range of problems in different contexts, and it is particularly well suited to clustering structured objects for which feature vectors can be difficult to obtain.
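As a rough illustration of the objective described above, the following sketch counts how many pairs a given clustering gets "wrong". It assumes the simplest setting, which the abstract does not commit to: similarity is binary, and every pair not listed as similar is treated as dissimilar. The function name `disagreements` and the toy data are hypothetical, chosen only for this example.

```python
from itertools import combinations


def disagreements(labels, similar):
    """Count pairs on which a clustering disagrees with the similarity judgments.

    labels:  dict mapping each object to a cluster id.
    similar: set of frozenset({u, v}) pairs judged similar.
             Assumption: every pair not in this set is dissimilar.
    """
    cost = 0
    for u, v in combinations(sorted(labels), 2):
        same_cluster = labels[u] == labels[v]
        is_similar = frozenset((u, v)) in similar
        # A disagreement is a similar pair that is split,
        # or a dissimilar pair that is placed together.
        if same_cluster != is_similar:
            cost += 1
    return cost


# Toy instance: a-b, b-c and d-e are similar; all other pairs are dissimilar.
similar = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("d", "e")]}

one_cluster = dict.fromkeys("abcde", 0)                  # everything together
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}         # {a, b, c} and {d, e}

print(disagreements(one_cluster, similar))  # 7 dissimilar pairs forced together
print(disagreements(split, similar))        # only the pair a-c disagrees
```

Note that the number of clusters is not given as input: it emerges from whichever partition minimizes the count.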
Despite its simplicity, generality and wide applicability, correlation clustering has so far received much more attention from the algorithmic theory community than from the data mining community. The goal of this tutorial is to show how correlation clustering can be a powerful addition to the toolkit of the data mining researcher and practitioner, and to encourage discussions and further research in the area.
In the tutorial we will survey the problem and its most common variants, with an emphasis on the algorithmic techniques and key ideas developed to derive efficient solutions. We will motivate the problems and discuss real-world applications, the scalability issues that may arise, and the existing approaches to handle them.
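To give a flavor of the kind of algorithmic technique such a survey might cover, here is a minimal sketch of the randomized pivot idea known from the correlation clustering literature. It is not taken from this page, and it is an assumption that the tutorial presents it in this form. The sketch assumes binary similarity on a complete graph; `pivot_clustering`, its `seed` parameter, and the toy data are hypothetical names for this example.

```python
import random


def pivot_clustering(nodes, similar, seed=0):
    """Greedy randomized pivot clustering (illustrative sketch).

    nodes:   iterable of objects.
    similar: set of frozenset({u, v}) pairs judged similar.
             Assumption: every other pair is dissimilar.
    Returns a list of clusters (sets) that partition the nodes.
    """
    rng = random.Random(seed)
    order = list(nodes)
    rng.shuffle(order)              # random pivot order
    unassigned = set(order)
    clusters = []
    for pivot in order:
        if pivot not in unassigned:
            continue                # already absorbed by an earlier pivot
        # The pivot takes every still-unassigned object similar to it.
        cluster = {pivot} | {
            v for v in unassigned if frozenset((pivot, v)) in similar
        }
        clusters.append(cluster)
        unassigned -= cluster
    return clusters


# Toy instance with two clean groups: {a, b, c} and {d, e}.
similar = {
    frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e")]
}
clusters = pivot_clustering("abcde", similar, seed=1)
print(sorted(sorted(c) for c in clusters))
```

Each object is examined once as a potential pivot, so the procedure needs only similarity lookups against the current pivot, which is one reason pivot-style methods are attractive when scalability matters. On instances that are not perfectly clusterable, the result depends on the random order.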

Supplementary Material

Part 1 of 3 (p1972-sidebyside1.mp4)
Part 2 of 3 (p1972-sidebyside2.mp4)
Part 3 of 3 (p1972-sidebyside3.mp4)

    Published In

    KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2014
    2028 pages
    ISBN:9781450329569
    DOI:10.1145/2623330
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tag

    1. clustering

    Qualifiers

    • Tutorial

    Conference

    KDD '14
    Acceptance Rates

    KDD '14 Paper Acceptance Rate 151 of 1,036 submissions, 15%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Cited By

    • (2024) An Elemental Decomposition of DNS Name-to-IP Graphs. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications, pp. 1661-1670. DOI: 10.1109/INFOCOM52122.2024.10621147. Online publication date: 20-May-2024
    • (2023) A combinatorial multi-armed bandit approach to correlation clustering. Data Mining and Knowledge Discovery 37:4, pp. 1630-1691. DOI: 10.1007/s10618-023-00937-5. Online publication date: 29-Jun-2023
    • (2023) An Efficient Local Search Algorithm for Correlation Clustering on Large Graphs. Combinatorial Optimization and Applications, pp. 3-15. DOI: 10.1007/978-3-031-49611-0_1. Online publication date: 9-Dec-2023
    • (2022) Correlation Clustering. Synthesis Lectures on Data Mining and Knowledge Discovery 12:1, pp. 1-149. DOI: 10.2200/S01163ED1V01Y202201DMK019. Online publication date: 8-Mar-2022
    • (2022) Narrowing the LOCAL-CONGEST Gaps in Sparse Networks via Expander Decompositions. Proceedings of the 2022 ACM Symposium on Principles of Distributed Computing, pp. 301-312. DOI: 10.1145/3519270.3538423. Online publication date: 20-Jul-2022
    • (2021) Correlation Clustering in Data Streams. Algorithmica. DOI: 10.1007/s00453-021-00816-9. Online publication date: 13-Mar-2021
    • (2021) Correlation Clustering with Global Weight Bounds. Machine Learning and Knowledge Discovery in Databases. Research Track, pp. 499-515. DOI: 10.1007/978-3-030-86520-7_31. Online publication date: 10-Sep-2021
    • (2017) Correlation clustering methodologies and their fundamental results. Expert Systems 35:1. DOI: 10.1111/exsy.12229. Online publication date: 13-Sep-2017
    • (2017) Revealing structure in large graphs. Pattern Recognition Letters 87:C, pp. 4-11. DOI: 10.1016/j.patrec.2016.09.007. Online publication date: 1-Feb-2017
    • (2015) Parallel correlation clustering on big graphs. Proceedings of the 29th International Conference on Neural Information Processing Systems - Volume 1, pp. 82-90. DOI: 10.5555/2969239.2969249. Online publication date: 7-Dec-2015
