Preserving privacy in association rule mining with bloom filters

Qiu, Ling; Li, Yingjiu; Wu, Xintao

doi:10.1007/s10844-006-0018-8

Published: 27 January 2007

Volume 29, pages 253–278, (2007)

Journal of Intelligent Information Systems

Ling Qiu¹,
Yingjiu Li² &
Xintao Wu³

Abstract

Privacy-preserving association rule mining has recently become an active research area. Two different approaches to this problem exist: perturbation-based and secure multiparty computation-based. One drawback of the perturbation-based approach is that it cannot always fully preserve individuals' privacy while achieving precise mining results. The secure multiparty computation-based approach works only in distributed environments and needs sophisticated protocols, which constrains its practical usage. In this paper, we propose a new approach for preserving privacy in association rule mining. The main idea is to use keyed Bloom filters to represent transactions as well as data items. The proposed approach can fully preserve privacy while maintaining the precision of mining results. The tradeoff between mining precision and storage requirement is investigated. We also propose a δ-folding technique to further reduce the storage requirement without sacrificing mining precision or running time.
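
The idea of representing transactions and items as keyed Bloom filters can be illustrated with a minimal sketch. It is not the construction from the full paper, which is not reproduced on this page: the use of HMAC-SHA256 as the keyed hash, the function names, and the parameters m (filter length in bits) and k (number of hash positions) are all assumptions made for illustration. Each transaction becomes an m-bit filter; an itemset is tested against a transaction by checking that every bit of the itemset's filter is set in the transaction's filter. Such a test has no false negatives but may have false positives, so the estimated support can only overstate the true support, which is one way to see the precision versus storage tradeoff mentioned in the abstract.

```python
import hashlib
import hmac


def keyed_positions(key, item, m, k):
    """Derive k bit positions in [0, m) for an item.

    Assumption: HMAC-SHA256 stands in for the keyed hash; the index i is
    mixed into the message to obtain k different hash values.
    """
    positions = []
    for i in range(k):
        message = f"{i}:{item}".encode("utf-8")
        digest = hmac.new(key, message, hashlib.sha256).digest()
        positions.append(int.from_bytes(digest[:8], "big") % m)
    return positions


def bloom_filter(key, items, m, k):
    """Encode a set of items as an m-bit Bloom filter stored in a Python int."""
    bits = 0
    for item in items:
        for pos in keyed_positions(key, item, m, k):
            bits |= 1 << pos
    return bits


def may_contain(transaction_bits, itemset_bits):
    """True if every bit of the itemset filter is set in the transaction filter.

    Never misses a real match; may report a match that is not real.
    """
    return (transaction_bits & itemset_bits) == itemset_bits


def estimated_support(transaction_filters, itemset_bits):
    """Count the transaction filters that appear to contain the itemset."""
    return sum(1 for t in transaction_filters if may_contain(t, itemset_bits))


if __name__ == "__main__":
    key = b"hypothetical-secret-key"  # kept by the data owner
    m, k = 256, 3                     # illustrative parameters only
    transactions = [
        {"bread", "milk"},
        {"bread", "butter", "milk"},
        {"beer", "chips"},
    ]
    filters = [bloom_filter(key, t, m, k) for t in transactions]
    query = bloom_filter(key, {"bread", "milk"}, m, k)
    print("estimated support:", estimated_support(filters, query))
```

Without the key, a party holding only the filters cannot recompute the bit positions of a given item, which is the intuition behind using a keyed rather than a public hash. Increasing m lowers the false positive rate at the cost of more storage per transaction.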

Author information

Authors and Affiliations

School of Math, Physics and Information Technology, James Cook University, Townsville, Queensland, 4811, Australia
Ling Qiu
School of Information Systems, Singapore Management University, Singapore, 178902, Singapore
Yingjiu Li
Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
Xintao Wu

Corresponding author

Correspondence to Ling Qiu.

About this article

Cite this article

Qiu, L., Li, Y. & Wu, X. Preserving privacy in association rule mining with bloom filters. J Intell Inf Syst 29, 253–278 (2007). https://doi.org/10.1007/s10844-006-0018-8

Received: 19 May 2005
Revised: 04 November 2005
Accepted: 17 January 2006
Published: 27 January 2007
Issue Date: December 2007
DOI: https://doi.org/10.1007/s10844-006-0018-8
