Noise Tolerant Classification by Chi Emerging Patterns

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3056)

Abstract

Classification is an important data mining problem. A desirable property of a classifier is noise tolerance. Emerging Patterns (EPs) are itemsets whose supports change significantly from one data class to another. In this paper, we first introduce Chi Emerging Patterns (Chi EPs), which are more resistant to noise than other kinds of EPs. We then use Chi EPs in a probabilistic approach for classification. The classifier, Bayesian Classification by Chi Emerging Patterns (BCCEP), can handle noise very well due to the inherent noise tolerance of the Bayesian approach and high quality patterns used in the probability approximation. The empirical study shows that our method is superior to other well-known classification methods such as NB, C4.5, SVM and JEP-C in terms of overall predictive accuracy, on “noisy” as well as “clean” benchmark datasets from the UCI Machine Learning Repository. Out of the 116 cases, BCCEP wins on 70 cases, NB wins on 30, C4.5 wins on 33, SVM wins on 32 and JEP-C wins on 21.
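
The definition of an Emerging Pattern given above (an itemset whose support changes significantly between classes) can be made concrete with a small sketch. It is illustrative only: it uses the growth-rate formulation common in the EP literature (the ratio of an itemset's supports in two classes), not the Chi EP criteria of this paper, which the abstract does not spell out. The function names, the `min_growth` threshold, and the toy data are assumptions made for the example.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def growth_rate(itemset: FrozenSet[str],
                background: List[Transaction],
                target: List[Transaction]) -> float:
    """Ratio of the support in `target` to the support in `background`."""
    s_bg = support(itemset, background)
    s_tg = support(itemset, target)
    if s_tg == 0.0:
        return 0.0           # pattern never occurs in the target class
    if s_bg == 0.0:
        return float("inf")  # occurs only in the target class (a "jumping" EP)
    return s_tg / s_bg


def is_emerging_pattern(itemset: FrozenSet[str],
                        background: List[Transaction],
                        target: List[Transaction],
                        min_growth: float = 2.0) -> bool:
    """True if the growth rate reaches the (hypothetical) threshold."""
    return growth_rate(itemset, background, target) >= min_growth


# Toy two-class dataset; attribute=value pairs are treated as items.
class_a = [
    frozenset({"odor=none", "cap=flat"}),
    frozenset({"odor=none", "cap=bell"}),
    frozenset({"odor=none", "cap=flat"}),
    frozenset({"odor=foul", "cap=flat"}),
]
class_b = [
    frozenset({"odor=foul", "cap=flat"}),
    frozenset({"odor=foul", "cap=bell"}),
    frozenset({"odor=none", "cap=bell"}),
    frozenset({"odor=foul", "cap=bell"}),
]

pattern = frozenset({"odor=foul"})
print(support(pattern, class_a), support(pattern, class_b))
print(growth_rate(pattern, class_a, class_b))
```

In this toy data, `odor=foul` has support 0.25 in `class_a` and 0.75 in `class_b`, so it qualifies as an EP of `class_b` under the assumed threshold of 2. An itemset with zero support in the background class gets an infinite growth rate, which corresponds to the jumping EPs behind the JEP-C baseline named in the abstract.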

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fan, H., Ramamohanarao, K. (2004). Noise Tolerant Classification by Chi Emerging Patterns. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_26

  • DOI: https://doi.org/10.1007/978-3-540-24775-3_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22064-0

  • Online ISBN: 978-3-540-24775-3

  • eBook Packages: Springer Book Archive
