Article
DOI: 10.1145/1014052.1014110

On demand classification of data streams

Published: 22 August 2004

Abstract

Current models of the classification problem do not effectively handle bursts of particular classes arriving at different times. Instead, they concentrate on methods for one-pass classification modeling of very large data sets. Our model treats data stream classification as a dynamic process in which simultaneous training and test streams are used to classify data sets. This model reflects real-life situations, since it is desirable to classify test streams in real time while both the training and test streams evolve. The aim is a classification system whose training model adapts quickly to changes in the underlying data stream. To achieve this goal, we propose an on-demand classification process that dynamically selects the appropriate window of past training data to build the classifier. The empirical results indicate that the system maintains high classification accuracy in an evolving data stream while providing an efficient solution to the classification task.
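
The idea of choosing a window of past training data on demand can be illustrated with a small sketch. This is not the paper's algorithm, whose details are not given in the abstract. It assumes a plain nearest-centroid classifier, a hypothetical list of candidate window lengths (`horizons`), and a hypothetical `holdout` of the newest labeled points used to score each candidate.

```python
from collections import defaultdict
from math import dist


def centroids(samples):
    """Return {label: mean vector as a tuple} for (features, label) pairs."""
    sums, counts = {}, defaultdict(int)
    for x, y in samples:
        if y not in sums:
            sums[y] = list(x)
        else:
            sums[y] = [a + b for a, b in zip(sums[y], x)]
        counts[y] += 1
    return {y: tuple(v / counts[y] for v in s) for y, s in sums.items()}


def predict(model, x):
    """Assign x to the label of the closest centroid."""
    return min(model, key=lambda y: dist(model[y], tuple(x)))


def select_window(train, horizons, holdout=20):
    """Pick the window length whose model best fits the newest labeled points.

    train: (features, label) pairs in arrival order, oldest first.
    horizons: positive candidate window lengths (illustrative assumption).
    holdout: positive count of newest labeled points kept for validation only.
    Returns (accuracy, horizon, model), or None if no candidate could be fit.
    """
    fit_part, recent = train[:-holdout], train[-holdout:]
    best = None
    for h in horizons:
        window = fit_part[-h:]  # the h most recent points available for fitting
        if not window:
            continue
        model = centroids(window)
        acc = sum(predict(model, x) == y for x, y in recent) / len(recent)
        if best is None or acc > best[0]:  # ties keep the earlier candidate
            best = (acc, h, model)
    return best
```

When the stream has recently changed, a short window that excludes stale data tends to score higher on the newest points. When the stream is stable, a longer window gives the classifier more data to work with. The sketch refits every candidate from scratch, which is only practical for small examples.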




    Published In

    KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2004
    874 pages
    ISBN:1581138881
    DOI:10.1145/1014052
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. classification
    2. data streams

    Qualifiers

    • Article

    Conference

    KDD04

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


    Bibliometrics & Citations

    Article Metrics

    • Downloads (Last 12 months): 27
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 02 Mar 2025

    Cited By

    • (2024) Probabilistic neural networks for incremental learning over time-varying streaming data with application to air pollution monitoring. Applied Soft Computing 161 (111702). DOI: 10.1016/j.asoc.2024.111702. Online publication date: Aug-2024
    • (2024) A Novel Approach to Address Concept Drift Detection with the Accuracy Enhanced Ensemble (AEE) in Data Stream Mining. Intelligent Computing for Sustainable Development (177-189). DOI: 10.1007/978-3-031-61287-9_14. Online publication date: 24-May-2024
    • (2023) A dynamic few-shot learning framework for medical image stream mining based on self-training. EURASIP Journal on Advances in Signal Processing 2023:1. DOI: 10.1186/s13634-023-00999-z. Online publication date: 1-May-2023
    • (2023) An Evolving Population Approach to Data-Stream Classification with Extreme Verification Latency. 2023 IEEE Symposium Series on Computational Intelligence (SSCI) (1843-1848). DOI: 10.1109/SSCI52147.2023.10371923. Online publication date: 5-Dec-2023
    • (2023) An Adaptive Hierarchical Method for Anytime Set-wise Clustering of Variable and High-Speed Data Streams. 2023 IEEE International Conference on Big Data (BigData) (568-577). DOI: 10.1109/BigData59044.2023.10386118. Online publication date: 15-Dec-2023
    • (2023) Landslide Susceptibility Prediction based on Decision Tree and Feature Selection Methods. Journal of the Indian Society of Remote Sensing 51:4 (771-786). DOI: 10.1007/s12524-022-01645-1. Online publication date: 7-Feb-2023
    • (2022) Scarcity of Labels in Non-Stationary Data Streams: A Survey. ACM Computing Surveys 55:2 (1-39). DOI: 10.1145/3494832. Online publication date: 21-Jan-2022
    • (2022) StreamSoNG: A Soft Streaming Classification Approach. IEEE Transactions on Emerging Topics in Computational Intelligence 6:3 (700-709). DOI: 10.1109/TETCI.2021.3097740. Online publication date: Jun-2022
    • (2021) ADAW: Age decay accuracy weighted ensemble method for drifting data stream mining. Intelligent Data Analysis 25:5 (1131-1152). DOI: 10.3233/IDA-205249. Online publication date: 15-Sep-2021
    • (2021) Low-Rank Transfer Learning for Multi-stream Data Classification. 2021 11th International Conference on Information Science and Technology (ICIST) (584-593). DOI: 10.1109/ICIST52614.2021.9440558. Online publication date: 21-May-2021
