Hiding sensitive knowledge without side effects

Gkoulalas-Divanis, Aris; Verykios, Vassilios S.

doi:10.1007/s10115-008-0178-7

Regular Paper
Published: 14 November 2008

Knowledge and Information Systems, Volume 20, pages 263–299 (2009)

Abstract

Sensitive knowledge hiding in large transactional databases is one of the major goals of privacy preserving data mining. However, it is only recently that researchers were able to identify exact solutions for the hiding of knowledge, depicted in the form of sensitive frequent itemsets and their related association rules. Exact solutions allow for the hiding of vulnerable knowledge without any critical compromises, such as the hiding of nonsensitive patterns or the accidental uncovering of infrequent itemsets, amongst the frequent ones, in the sanitized outcome. In this paper, we highlight the process of border revision, which plays a significant role towards the identification of exact hiding solutions, and we provide efficient algorithms for the computation of the revised borders. Furthermore, we review two algorithms that identify exact hiding solutions, and we extend the functionality of one of them to effectively identify exact solutions for a wider range of problems (than its original counterpart). Following that, we introduce a novel framework for decomposition and parallel solving of hiding problems, which are handled by each of these approaches. This framework improves to a substantial degree the size of the problems that both algorithms can handle and significantly decreases their runtime. Through experimentation, we demonstrate the effectiveness of these approaches toward providing high quality knowledge hiding solutions.
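
The notions used in the abstract (frequent itemsets, borders, and side effects of a sanitized database) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the brute-force miner, the toy databases, and all function names are assumptions made for illustration only. It shows what a revised positive border looks like once the sensitive itemsets and their supersets are removed, and how a candidate sanitized database can be checked against it.

```python
from itertools import combinations


def support(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def frequent_itemsets(db, min_count):
    """Brute-force enumeration of all itemsets with support >= min_count."""
    items = sorted(set().union(*db))
    found = set()
    for k in range(1, len(items) + 1):
        level = {frozenset(c) for c in combinations(items, k)
                 if support(frozenset(c), db) >= min_count}
        if not level:
            break  # no frequent k-itemset means no frequent (k+1)-itemset
        found |= level
    return found


def positive_border(frequent):
    """Maximal frequent itemsets: those with no frequent proper superset."""
    return {x for x in frequent if not any(x < y for y in frequent)}


def ideal_frequent_set(frequent, sensitive):
    """Itemsets that should stay frequent after hiding: every frequent
    itemset that is not a superset of (or equal to) a sensitive itemset."""
    return {x for x in frequent if not any(s <= x for s in sensitive)}


def side_effects(original_frequent, sensitive, sanitized_db, min_count):
    """Compare a sanitized database against the ideal outcome."""
    ideal = ideal_frequent_set(original_frequent, sensitive)
    actual = frequent_itemsets(sanitized_db, min_count)
    return {
        "exposed": {s for s in sensitive if s in actual},  # still frequent
        "lost": ideal - actual,                # nonsensitive itemsets hidden
        "ghost": actual - original_frequent,   # newly frequent itemsets
    }


# Hypothetical toy data: five transactions over items a, b, c.
MIN_COUNT = 2
original_db = [frozenset(t) for t in ("abc", "abc", "ab", "ac", "bc")]
sensitive = {frozenset("ab")}

original_frequent = frequent_itemsets(original_db, MIN_COUNT)
ideal = ideal_frequent_set(original_frequent, sensitive)
revised_border = positive_border(ideal)

# Two hand-made candidate sanitizations (items removed from transactions).
exact_db = [frozenset(t) for t in ("bc", "abc", "a", "ac", "bc")]
lossy_db = [frozenset(t) for t in ("bc", "bc", "ab", "ac", "bc")]

exact_report = side_effects(original_frequent, sensitive, exact_db, MIN_COUNT)
lossy_report = side_effects(original_frequent, sensitive, lossy_db, MIN_COUNT)
```

In this toy setting, a sanitization counts as exact when all three sets in the report are empty, that is, when the sensitive itemsets are hidden and the frequent itemsets of the sanitized database coincide with the ideal set. The brute-force enumeration is exponential in the number of items and is only meant for small examples.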

Author information

Authors and Affiliations

Department of Computer and Communication Engineering, University of Thessaly, 382 21, Vólos, Greece
Aris Gkoulalas-Divanis & Vassilios S. Verykios

Corresponding author

Correspondence to Aris Gkoulalas-Divanis.

About this article

Cite this article

Gkoulalas-Divanis, A., Verykios, V.S. Hiding sensitive knowledge without side effects. Knowl Inf Syst 20, 263–299 (2009). https://doi.org/10.1007/s10115-008-0178-7

Received: 18 January 2008
Revised: 16 September 2008
Accepted: 27 September 2008
Published: 14 November 2008
Issue Date: September 2009
DOI: https://doi.org/10.1007/s10115-008-0178-7
