A Study of Interestingness Measures for Associative Classification on Imbalanced Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9441)

Abstract

Associative Classification (AC) is a well-known tool in knowledge discovery and has been shown to produce competitive classifiers. However, imbalanced data poses a challenge for most classifier learning algorithms, including AC methods. Because Interestingness Measures (IMs) play an important role in generating interesting rules and building good classifiers in the AC process, selecting suitable IMs is important for improving AC’s performance on imbalanced data. In this paper, we aim to improve AC’s performance on imbalanced data by studying IMs. This involves two main tasks: first, finding which measures behave similarly on imbalanced data, and second, selecting appropriate measures. We evaluate each measure’s performance by AUC, which is commonly used to evaluate classification on imbalanced data. First, based on these performances, we propose a frequent correlated patterns mining method to extract stable clusters of IMs with similar behaviors. Second, using an IM ranking computation method, we identify 26 suitable measures for imbalanced data and divide them into two groups: one especially for extremely imbalanced data and the other for slightly imbalanced data.
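
To make the role of an interestingness measure concrete, the following minimal sketch compares three standard measures (support, confidence, and lift) on two class association rules, and computes AUC from ranked scores. It is an illustration only and not the authors' procedure: the abstract does not list the 26 selected measures, and the class `RuleCounts`, the function names, and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RuleCounts:
    """Contingency counts for a class association rule A -> c (hypothetical helper)."""
    n: int     # total number of records
    n_a: int   # records matching the antecedent A
    n_c: int   # records belonging to class c
    n_ac: int  # records matching A and belonging to class c


def support(r: RuleCounts) -> float:
    """P(A and c)."""
    return r.n_ac / r.n


def confidence(r: RuleCounts) -> float:
    """P(c | A)."""
    return r.n_ac / r.n_a


def lift(r: RuleCounts) -> float:
    """P(c | A) / P(c): confidence corrected for the class prior."""
    return confidence(r) / (r.n_c / r.n)


def auc(scores, labels) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up imbalanced data set: 1000 records, 50 of them in the minority class.
minority_rule = RuleCounts(n=1000, n_a=40, n_c=50, n_ac=20)
majority_rule = RuleCounts(n=1000, n_a=400, n_c=950, n_ac=384)

for name, rule in [("minority", minority_rule), ("majority", majority_rule)]:
    print(name, support(rule), confidence(rule), round(lift(rule), 3))

# Made-up classifier scores for four test records (label 1 = minority class).
print(auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

With these made-up counts, confidence ranks the majority-class rule higher (0.96 against 0.5), while lift ranks the minority-class rule higher (10 against roughly 1.01). Different measures can therefore order the same rules differently when classes are imbalanced, which is why the choice of measure matters in this setting.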

Acknowledgements

This work is supported by the National Natural Science Foundation of China (71001016).

Author information

Corresponding author

Correspondence to Guangfei Yang.

Appendix

See Table 3.

Table 3. Interestingness measures for association patterns

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Yang, G., Cui, X. (2015). A Study of Interestingness Measures for Associative Classification on Imbalanced Data. In: Li, XL., Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D. (eds) Trends and Applications in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, vol 9441. Springer, Cham. https://doi.org/10.1007/978-3-319-25660-3_12

  • DOI: https://doi.org/10.1007/978-3-319-25660-3_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25659-7

  • Online ISBN: 978-3-319-25660-3

  • eBook Packages: Computer Science, Computer Science (R0)
