Mining Association Rules under Privacy Constraints

  • Chapter
Privacy-Preserving Data Mining

Part of the book series: Advances in Database Systems (ADBS, volume 34)

Data mining services require accurate input data for their results to be meaningful, but privacy concerns may impel users to provide spurious information. In this chapter, we study whether users can be encouraged to provide correct information by ensuring that the mining process cannot, with any reasonable degree of certainty, violate their privacy. Our analysis is in the context of extracting association rules from large historical databases, a popular mining process that identifies interesting correlations between database attributes. We analyze the various schemes that have been proposed for this purpose with regard to a variety of parameters including the degree of trust, privacy metric, model accuracy and mining efficiency.

Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Haritsa, J.R. (2008). Mining Association Rules under Privacy Constraints. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_10

  • DOI: https://doi.org/10.1007/978-0-387-70992-5_10

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-70991-8

  • Online ISBN: 978-0-387-70992-5

  • eBook Packages: Computer Science, Computer Science (R0)
