Quality-Aware Association Rule Mining

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3918)

Abstract

The quality of discovered association rules is commonly evaluated by interestingness measures (typically support and confidence), which are meant to help the user understand and use the newly discovered knowledge. Low-quality datasets severely degrade the quality of the discovered association rules, and one might legitimately wonder whether a so-called “interesting” rule, denoted LHS -> RHS, is meaningful when 30% of the LHS data are no longer up to date, 20% of the RHS data are not accurate, and 15% of the LHS data come from a data source that is well known for its poor credibility. In this paper we propose to integrate data quality measures for effective and quality-aware association rule mining, and we propose a cost-based probabilistic model for selecting legitimately interesting rules. Experiments on the challenging KDD-CUP-98 datasets show, for different variations of data quality indicators, the corresponding cost and quality of the discovered association rules that can (or cannot) be legitimately selected.
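
To make the idea in the abstract concrete, the following is a minimal sketch of a rule LHS -> RHS being scored by support and confidence and then checked against data quality scores for each side. It is not the paper's cost-based probabilistic model, which the abstract does not detail. The quality dimensions (`freshness`, `accuracy`, `credibility`), the worst-case aggregation in `side_quality`, the thresholds, and the toy transactions are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List

Transaction = FrozenSet[str]
# Hypothetical per-item quality scores in [0, 1], one per quality dimension.
QualityTable = Dict[str, Dict[str, float]]


def count(itemset: FrozenSet[str], transactions: List[Transaction]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions) if transactions else 0.0


def confidence(lhs: FrozenSet[str], rhs: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """Among transactions containing LHS, the fraction that also contain RHS."""
    lhs_count = count(lhs, transactions)
    return count(lhs | rhs, transactions) / lhs_count if lhs_count else 0.0


def side_quality(items: FrozenSet[str], quality: QualityTable) -> float:
    """Assumed aggregate: the worst score over all items and dimensions."""
    return min(min(quality[item].values()) for item in items)


def assess_rule(lhs: FrozenSet[str], rhs: FrozenSet[str],
                transactions: List[Transaction], quality: QualityTable,
                min_support: float = 0.3, min_confidence: float = 0.6,
                min_quality: float = 0.7) -> Dict[str, object]:
    """Report whether a rule is interesting and whether its data is trustworthy."""
    sup = support(lhs | rhs, transactions)
    conf = confidence(lhs, rhs, transactions)
    q_lhs = side_quality(lhs, quality)
    q_rhs = side_quality(rhs, quality)
    interesting = sup >= min_support and conf >= min_confidence
    return {
        "support": sup,
        "confidence": conf,
        "lhs_quality": q_lhs,
        "rhs_quality": q_rhs,
        "interesting": interesting,
        # Illustrative threshold test only, not the paper's selection model.
        "legitimate": interesting and min(q_lhs, q_rhs) >= min_quality,
    }


# Toy data, invented for this sketch.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread"}),
    frozenset({"milk", "eggs"}),
    frozenset({"bread", "milk"}),
]
quality = {
    "bread": {"freshness": 0.9, "accuracy": 0.95, "credibility": 0.85},
    "milk": {"freshness": 0.7, "accuracy": 0.8, "credibility": 0.6},
    "eggs": {"freshness": 0.8, "accuracy": 0.9, "credibility": 0.9},
}

result = assess_rule(frozenset({"bread"}), frozenset({"milk"}), transactions, quality)
print(result)
```

In this toy run the rule bread -> milk passes the support and confidence thresholds, yet it is not flagged as legitimate because the RHS data falls below the assumed quality threshold. That is the situation the abstract warns about: a rule can look interesting while resting on data that should not be trusted.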

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Berti-Équille, L. (2006). Quality-Aware Association Rule Mining. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_51

  • DOI: https://doi.org/10.1007/11731139_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33206-0

  • Online ISBN: 978-3-540-33207-7

  • eBook Packages: Computer Science, Computer Science (R0)
