Concept Drift Detection Using Online Histogram-Based Bayesian Classifiers

  • Conference paper
AI 2016: Advances in Artificial Intelligence (AI 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9992)

Abstract

In this paper, we present a novel algorithm that performs online histogram-based classification and is specifically designed for the case when the data is dynamic and its distribution is non-stationary. Our method, called the Online Histogram-based Naïve Bayes Classifier (OHNBC), involves a statistical classifier that is based on well-established Bayesian theory but makes some assumptions with respect to the independence of the attributes. Moreover, this classifier generates a prediction model using uni-dimensional histograms whose segments, or buckets, are fixed in terms of their cardinalities but dynamic in terms of their widths. Additionally, our algorithm invokes the principles of information theory to automatically identify changes in the performance of the classifier and, consequently, forces the reconstruction of the classification model at run time as and when it is needed. These properties have been confirmed experimentally over numerous data sets from different domains; in the interest of space and brevity, we present here only a subset of the available results, and more detailed results are found in [2]. As far as we know, our histogram-based Naïve Bayes classification paradigm for time-varying datasets is both novel and pioneering.
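
To make the bucket construction described in the abstract concrete, the following minimal Python sketch builds equi-depth histograms (roughly equal cardinality per bucket, data-dependent widths) for each class and attribute, and combines them under the Naïve Bayes independence assumption. It is an illustration only and not the OHNBC algorithm itself: it is trained in batch rather than online, it omits the information-theoretic drift detection, and the names `equi_depth_edges`, `bucket_density`, `HistogramNaiveBayes`, and the `floor` constant are assumptions introduced here.

```python
import bisect
import math
from collections import defaultdict


def equi_depth_edges(values, n_buckets):
    """Bucket edges chosen so each bucket holds roughly the same number of points."""
    xs = sorted(values)
    n = len(xs)
    # Edge k sits at the k/n_buckets quantile of the sorted sample.
    edges = [xs[(k * n) // n_buckets] for k in range(n_buckets)]
    edges.append(xs[-1])
    return edges


def bucket_density(edges, x, floor=1e-6):
    """Piecewise-constant density estimate at x for an equi-depth histogram."""
    n_buckets = len(edges) - 1
    if x < edges[0] or x > edges[-1]:
        # Assumption: a tiny constant density outside the observed range.
        return floor
    i = min(bisect.bisect_right(edges, x) - 1, n_buckets - 1)
    # Guard against zero-width buckets caused by repeated values.
    width = max(edges[i + 1] - edges[i], floor)
    # Each bucket carries probability mass 1/n_buckets by construction.
    return (1.0 / n_buckets) / width


class HistogramNaiveBayes:
    """Batch Naive Bayes with one equi-depth histogram per (class, attribute)."""

    def __init__(self, n_buckets=4):
        self.n_buckets = n_buckets
        self.edges = {}   # (label, attribute index) -> bucket edges
        self.priors = {}  # label -> prior probability

    def fit(self, X, y):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        for label, rows in by_class.items():
            self.priors[label] = len(rows) / len(X)
            for j in range(len(rows[0])):
                column = [r[j] for r in rows]
                self.edges[(label, j)] = equi_depth_edges(column, self.n_buckets)
        return self

    def predict(self, row):
        def log_posterior(label):
            # Independence assumption: sum the per-attribute log densities.
            score = math.log(self.priors[label])
            for j, x in enumerate(row):
                score += math.log(bucket_density(self.edges[(label, j)], x))
            return score

        return max(self.priors, key=log_posterior)
```

In a streaming setting, one could call `fit` again on a recent window of samples whenever a drift signal fires; how such a signal is derived from information-theoretic principles is specific to the paper and is not reproduced in this sketch.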

C.A. Astudillo—This work was partially supported by the FONDECYT Grant No. 11121350, Chile.

B.J. Oommen—Chancellor’s Professor; Fellow: IEEE and Fellow: IAPR. This author is also an Adjunct Professor with the University of Agder in Grimstad, Norway. The work of this author was partially supported by NSERC, the Natural Sciences and Engineering Research Council of Canada.

Notes

  1. As mentioned earlier, in the interest of space and brevity, we present here only a subset of the available results. More detailed results are found in [2].

  2. The data generation tool DatGen is publicly available at the following URL: http://www.datasetgenerator.com.

References

  1. Abdulsalam, H., Skillicorn, D., Martin, P.: Classification using streaming random forests. IEEE Trans. Knowl. Data Eng. 23(1), 22–36 (2011)

  2. Astudillo, C.A., Gonzalez, J., Oommen, B.J., Yazidi, A.: Concept drift detection using classifiers that are online, Bayesian and histogram-based. Unabridged Version of this paper. (In Preparation)

  3. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)

  4. García-Laencina, P., Sancho-Gómez, J., Figueiras-Vidal, A.: Pattern classification with missing data: a review. Neural Comput. Appl. 19, 263–282 (2010). doi:10.1007/s00521-009-0295-6

  5. Last, M.: Online classification of nonstationary data streams. Intell. Data Anal. 6(2), 129–147 (2002)

  6. Littlestone, N.: Learning quickly when irrelevant attributes abound: a new linear-threshold algorithm. Mach. Learn. 2(4), 285–318 (1988)

  7. Saffari, A., Leistner, C., Santner, J., Godec, M., Bischof, H.: On-line random forests. In: 2009 IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), pp. 1393–1400, October 2009

Corresponding author

Correspondence to B. John Oommen.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Astudillo, C.A., González, J.I., Oommen, B.J., Yazidi, A. (2016). Concept Drift Detection Using Online Histogram-Based Bayesian Classifiers. In: Kang, B.H., Bai, Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science, vol 9992. Springer, Cham. https://doi.org/10.1007/978-3-319-50127-7_14

  • DOI: https://doi.org/10.1007/978-3-319-50127-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50126-0

  • Online ISBN: 978-3-319-50127-7

  • eBook Packages: Computer Science, Computer Science (R0)
