DOI: 10.1145/2695664.2695754
research-article

Pairwise combination of classifiers for ensemble learning on data streams

Published: 13 April 2015

ABSTRACT

This work presents two voting strategies for ensemble learning on data streams, both based on pairwise combination of component classifiers. Despite efforts to build a diverse ensemble, there is always some degree of overlap between the models of component classifiers. Our voting strategies aim to use these overlaps to support the ensemble prediction. We hypothesize that combining pairs of classifiers can alleviate incorrect individual predictions that would otherwise negatively impact the overall ensemble decision. The first strategy, Pairwise Accuracy (PA), combines the shared accuracy estimates of all possible pairs in the ensemble, while the second strategy, Pairwise Patterns (PP), records patterns of pairwise decisions during training and uses these patterns during prediction. We present empirical results comparing ensemble classifiers using their original voting methods against the same ensembles using our proposed methods, on both real and synthetic datasets, with and without concept drift. Our analysis indicates that pairwise voting with PP enhances overall performance, especially on real datasets, and that PA is useful whenever there are noticeable differences in accuracy estimates among ensemble members, which is common during concept drift.
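
The abstract does not give the formulas behind PA and PP, so the following Python sketch is only one plausible reading of the two ideas, not the authors' algorithms. It assumes that "shared accuracy" means the fraction of labelled instances on which both members of a pair were correct, and that a "pattern" is the pair of labels two members output together with the true label observed for it. The class names, the smoothing term, and the majority-vote fallback are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


class PairwiseAccuracyVoter:
    """Hypothetical PA-style voter: agreeing pairs vote with their shared accuracy."""

    def __init__(self, n_members):
        self.pairs = list(combinations(range(n_members), 2))
        self.both_correct = Counter()  # pair -> times both members were right
        self.seen = 0                  # labelled instances observed so far

    def update(self, predictions, true_label):
        # Called once per labelled instance with every member's prediction.
        self.seen += 1
        for i, j in self.pairs:
            if predictions[i] == true_label and predictions[j] == true_label:
                self.both_correct[(i, j)] += 1

    def predict(self, predictions):
        scores = defaultdict(float)
        for i, j in self.pairs:
            if predictions[i] == predictions[j]:
                # Assumed smoothing so pairs with no history still cast a small vote.
                weight = (self.both_correct[(i, j)] + 1) / (self.seen + 2)
                scores[predictions[i]] += weight
        if not scores:  # no pair agrees: fall back to a plain majority vote
            return Counter(predictions).most_common(1)[0][0]
        return max(scores, key=scores.get)


class PairwisePatternVoter:
    """Hypothetical PP-style voter: each pair votes for the label its pattern implied."""

    def __init__(self, n_members):
        self.pairs = list(combinations(range(n_members), 2))
        # (i, j, prediction of i, prediction of j) -> counts of observed true labels
        self.patterns = defaultdict(Counter)

    def update(self, predictions, true_label):
        for i, j in self.pairs:
            self.patterns[(i, j, predictions[i], predictions[j])][true_label] += 1

    def predict(self, predictions):
        scores = Counter()
        for i, j in self.pairs:
            history = self.patterns.get((i, j, predictions[i], predictions[j]))
            if history:
                # The pair votes for the true label most often seen with this pattern.
                scores[history.most_common(1)[0][0]] += 1
        if not scores:  # no known pattern: fall back to a plain majority vote
            return Counter(predictions).most_common(1)[0][0]
        return scores.most_common(1)[0][0]
```

In this sketch, `update` would be called with the members' predictions each time a labelled instance arrives, and `predict` replaces the ensemble's usual vote. Under these assumptions, the pattern voter can override a majority of members that are consistently wrong in the same way, which is the kind of correction the abstract's hypothesis describes.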


      • Published in

        SAC '15: Proceedings of the 30th Annual ACM Symposium on Applied Computing
        April 2015
        2418 pages
        ISBN:9781450331968
        DOI:10.1145/2695664

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        SAC '15 Paper Acceptance Rate: 291 of 1,211 submissions, 24%
        Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
