Evaluation of Association Rule Quality Measures through Feature Extraction

Balcázar, José L.; Dogbey, Francis

doi:10.1007/978-3-642-41398-8_7

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8207)

Abstract

The practical success of association rule mining depends heavily on the criterion used to choose among the many rules that are typically mined. Many rule quality measures exist in the literature. We propose a protocol to evaluate the evaluation measures themselves. For each association rule, we measure the improvement in accuracy that a commonly used predictor can obtain from an additional feature, constructed according to the exceptions to the rule. We select a reference set of rules that are helpful in this sense. Our evaluation method then takes into account both how many of these helpful rules are found near the top rules for a given quality measure, and how near the top they are. We focus on seven association rule quality measures. Our experiments indicate that multiplicative improvement and, to a lesser extent, support and leverage (a.k.a. weighted relative accuracy) tend to obtain better results than the other measures.
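As a rough illustration of the quantities named in the abstract, the following Python sketch computes support, confidence, and leverage for a single rule X → Y on a toy transaction list, and builds a binary feature that flags the exceptions to the rule. The toy data, the function names, and the reading of "exception" as a transaction that contains the antecedent but not the consequent are assumptions made for illustration; the abstract does not spell out the authors' exact construction.

```python
from typing import FrozenSet, List, Sequence

Transaction = FrozenSet[str]


def support(transactions: Sequence[Transaction], items: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions: Sequence[Transaction],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """supp(X ∪ Y) / supp(X); defined as 0.0 when X never occurs."""
    s_x = support(transactions, antecedent)
    if s_x == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / s_x


def leverage(transactions: Sequence[Transaction],
             antecedent: FrozenSet[str],
             consequent: FrozenSet[str]) -> float:
    """supp(X ∪ Y) - supp(X) * supp(Y), also known as weighted relative accuracy."""
    return (support(transactions, antecedent | consequent)
            - support(transactions, antecedent) * support(transactions, consequent))


def exception_feature(transactions: Sequence[Transaction],
                      antecedent: FrozenSet[str],
                      consequent: FrozenSet[str]) -> List[int]:
    """Assumed reading of 'exceptions to the rule': 1 where the antecedent
    holds but the consequent does not, 0 elsewhere."""
    return [int(antecedent <= t and not consequent <= t) for t in transactions]


# Hypothetical toy data: five transactions over the items a, b, c.
transactions = [frozenset(t) for t in (("a", "b"), ("a", "b"), ("a",), ("b",), ("c",))]
x, y = frozenset({"a"}), frozenset({"b"})

rule_support = support(transactions, x | y)
rule_confidence = confidence(transactions, x, y)
rule_leverage = leverage(transactions, x, y)
new_feature = exception_feature(transactions, x, y)

print(rule_support, rule_confidence, rule_leverage, new_feature)
```

In a protocol of the kind described, a column such as `new_feature` would be appended to the dataset, and the change in a predictor's accuracy with and without it would indicate whether the rule is helpful. The predictor and the accuracy comparison are not shown here.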

This work has been partially supported by project BASMATI (TIN2011-27479-C04) of Programa Nacional de Investigación, Ministerio de Ciencia e Innovación (MICINN), Spain, and by the Pascal-2 Network of the European Union.

Author information

Authors and Affiliations

Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Barcelona, Spain
José L. Balcázar
Advanced Information Technology Institute, Ghana-India Kofi Annan Centre of Excellence in ICT, Accra, Ghana
Francis Dogbey

Editor information

Editors and Affiliations

School of Information Systems, Computing and Mathematics, Brunel University, UB8 3PH, Uxbridge, Middlesex, UK
Allan Tucker & Stephen Swift
Faculty of Computer Science/IT, Ostfalia University of Applied Sciences, Am Exer 2, 38302, Wolfenbüttel, Germany
Frank Höppner
Faculty of Science, Department of Information and Computing Science, Buys Ballot Laboratory, Universiteit Utrecht, Princetonplein 5, 3584 CC, Utrecht, The Netherlands
Arno Siebes

About this paper

Cite this paper

Balcázar, J.L., Dogbey, F. (2013). Evaluation of Association Rule Quality Measures through Feature Extraction. In: Tucker, A., Höppner, F., Siebes, A., Swift, S. (eds) Advances in Intelligent Data Analysis XII. IDA 2013. Lecture Notes in Computer Science, vol 8207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41398-8_7

DOI: https://doi.org/10.1007/978-3-642-41398-8_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41397-1
Online ISBN: 978-3-642-41398-8
eBook Packages: Computer Science; Computer Science (R0)