Tutorial · DOI: 10.1145/3511808.3557504

Learning and Mining with Noisy Labels

Published: 17 October 2022

Abstract

"Knowledge should not be accessible only to those who can pay," said Robert May, chair of UC's faculty Academic Senate. Similarly, machine learning should not be accessible only to those who can pay; it should benefit the whole world, especially developing countries in Africa and Asia. As dataset sizes grow, obtaining clean supervision becomes laborious and expensive, particularly for developing countries. As a result, the volume of noisy supervision becomes enormous, e.g., web-scale image and speech data with noisy labels. Standard machine learning, however, assumes that the supervised information is fully clean and intact. Noisy data therefore degrades the performance of most standard learning algorithms, and sometimes even causes existing algorithms to break down.
Many theories and approaches have been proposed to deal with noisy data. As far as we know, learning and mining with noisy labels spans two important eras in the machine learning, data mining, and knowledge management communities: statistical learning (i.e., shallow learning) and deep learning. In the era of statistical learning, work on learning and mining with noisy labels focused on designing noise-tolerant losses or unbiased risk estimators. In the era of deep learning, by contrast, there are more options for combating noisy labels, such as designing biased risk estimators or leveraging the memorization effects of deep networks. In this tutorial, we summarize the foundations and walk through the most recent noisy-label-tolerant techniques. By participating in the tutorial, the audience will gain a broad knowledge of learning and mining with noisy labels from the viewpoints of statistical learning theory, deep learning, detailed analysis of typical algorithms and frameworks, and their real-world data mining applications.
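The two eras above each lend themselves to a small worked sketch. Below is a minimal, illustrative Python sketch (assuming NumPy; the function names, the symmetric-noise transition matrix, and the fixed noise rate are our own illustrative choices, not taken from the tutorial itself): forward loss correction pushes the model's predicted class probabilities through a noise transition matrix before computing cross-entropy against the noisy label, while the small-loss trick exploits the memorization effect, i.e., deep networks tend to fit clean samples before noisy ones, so low-loss samples are treated as likely clean.

```python
import numpy as np

def symmetric_transition_matrix(num_classes, noise_rate):
    """Transition matrix T for symmetric label noise:
    T[i, j] = P(observed label = j | true label = i)."""
    T = np.full((num_classes, num_classes), noise_rate / (num_classes - 1))
    np.fill_diagonal(T, 1.0 - noise_rate)
    return T

def forward_corrected_ce(probs, noisy_label, T):
    """Forward loss correction: pass the clean-class probabilities
    through T, then take cross-entropy against the noisy label."""
    noisy_probs = T.T @ probs  # probability of each observed (noisy) label
    return -np.log(noisy_probs[noisy_label] + 1e-12)

def small_loss_selection(losses, noise_rate):
    """Small-loss trick: keep the (1 - noise_rate) fraction of samples
    with the smallest per-sample loss, treating them as likely clean."""
    n_keep = int(len(losses) * (1.0 - noise_rate))
    return np.argsort(losses)[:n_keep]

# With an identity transition matrix (no noise), forward correction
# reduces to ordinary cross-entropy.
probs = np.array([0.7, 0.2, 0.1])
loss = forward_corrected_ce(probs, 0, np.eye(3))

# Keep the two-thirds of samples with the smallest losses.
losses = np.array([0.10, 2.30, 0.05, 1.80, 0.20, 0.15])
likely_clean = small_loss_selection(losses, noise_rate=1 / 3)
```

In practice, the selected small-loss subset is fed back into training (as in co-teaching-style methods), and the transition matrix is estimated from data rather than assumed known.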

Index Terms

  1. Learning and Mining with Noisy Labels

    Published In

    CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
    October 2022
    5274 pages
    ISBN:9781450392365
    DOI:10.1145/3511808
    • General Chairs:
    • Mohammad Al Hasan,
    • Li Xiong
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. data mining
    2. machine learning
    3. noisy labels

    Qualifiers

    • Tutorial

    Funding Sources

    • JST CREST Grant
    • Institute for AI and Beyond UTokyo
    • NSFC Young Scientists Fund
    • RGC Early Career Scheme
    • ARC DECRA Project
    • JST AIP Acceleration Research Grant
    • Guangdong Basic and Applied Basic Research Foundation
    • NSF IIS Grant

    Conference

    CIKM '22

    Acceptance Rates

    CIKM '22 Paper Acceptance Rate 621 of 2,257 submissions, 28%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


