Mining Association Rules under Privacy Constraints

Haritsa, Jayant R.

doi:10.1007/978-0-387-70992-5_10

Part of the book series: Advances in Database Systems (ADBS, volume 34)

Data mining services require accurate input data for their results to be meaningful, but privacy concerns may impel users to provide spurious information. In this chapter, we study whether users can be encouraged to provide correct information by ensuring that the mining process cannot, with any reasonable degree of certainty, violate their privacy. Our analysis is in the context of extracting association rules from large historical databases, a popular mining process that identifies interesting correlations between database attributes. We analyze the various schemes that have been proposed for this purpose with regard to a variety of parameters including the degree of trust, privacy metric, model accuracy and mining efficiency.

Author information

Authors and Affiliations

Database Systems Lab, Indian Institute of Science, Bangalore, Karnataka, India
Jayant R. Haritsa

Editor information

Editors and Affiliations

IBM Thomas J. Watson Research Center, 19 Skyline Drive, 10532, Hawthorne, NY, USA
Charu C. Aggarwal
Department of Computer Science, University of Illinois at Chicago, 854 South Morgan Street, 60607-7053, Chicago, IL, USA
Philip S. Yu

About this chapter

Cite this chapter

Haritsa, J.R. (2008). Mining Association Rules under Privacy Constraints. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_10

DOI: https://doi.org/10.1007/978-0-387-70992-5_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-70991-8
Online ISBN: 978-0-387-70992-5
eBook Packages: Computer Science, Computer Science (R0)
