ABSTRACT
This work presents two voting strategies for ensemble learning on data streams based on the pairwise combination of component classifiers. Despite efforts to build a diverse ensemble, there is always some degree of overlap between the models of component classifiers. Our voting strategies aim to exploit these overlaps to support ensemble prediction. We hypothesize that by combining pairs of classifiers it is possible to alleviate incorrect individual predictions that would otherwise negatively impact the overall ensemble decision. The first strategy, Pairwise Accuracy (PA), combines the shared accuracy estimates of all possible pairs in the ensemble, while the second strategy, Pairwise Patterns (PP), records patterns of pairwise decisions during training and uses these patterns during prediction. We present empirical results comparing ensemble classifiers using their original voting methods and our proposed methods on both real and synthetic datasets, with and without concept drift. Our analysis indicates that pairwise voting with PP enhances overall performance, especially on real datasets, and that PA is useful whenever there are noticeable differences in accuracy estimates among ensemble members, which is common during concept drifts.
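The abstract does not spell out the PA algorithm, but the core idea it states, combining shared accuracy estimates over all classifier pairs, can be illustrated with a minimal sketch. The function name, the data structures, and the fallback to simple majority voting below are our own assumptions for illustration, not the paper's actual implementation:

```python
from itertools import combinations
from collections import Counter

def pairwise_accuracy_vote(predictions, pair_accuracy):
    """Illustrative sketch of pairwise-accuracy (PA) style voting.

    predictions:   dict mapping classifier id -> predicted class label.
    pair_accuracy: dict mapping (i, j) -> estimated shared accuracy of
                   the pair, e.g. how often i and j were simultaneously
                   correct on recent stream instances (assumed given).

    Each pair that agrees on a label adds its shared-accuracy estimate
    to that label's score; disagreeing pairs contribute nothing.
    """
    scores = Counter()
    for i, j in combinations(sorted(predictions), 2):
        if predictions[i] == predictions[j]:
            scores[predictions[i]] += pair_accuracy.get((i, j), 0.0)
    if not scores:  # no pair agrees: fall back to plain majority vote
        scores = Counter(predictions.values())
    return scores.most_common(1)[0][0]

# Example: the historically accurate pair (0, 1) outvotes classifier 2.
preds = {0: "A", 1: "A", 2: "B"}
shared_acc = {(0, 1): 0.9, (0, 2): 0.4, (1, 2): 0.4}
print(pairwise_accuracy_vote(preds, shared_acc))  # -> A
```

In this sketch a pair whose members were often correct together carries more weight than two individually mediocre classifiers that happen to agree, which mirrors the paper's hypothesis that pairwise evidence can suppress incorrect individual predictions.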