A Constraint-Based Model for the Frequent Itemset Hiding Problem

Verykios, Vassilios S.; Stavropoulos, Elias C.; Zorkadis, Vasilis; Elmagarmid, Ahmed K.

doi:10.1007/978-3-030-37545-4_4

Vassilios S. Verykios⁸,
Elias C. Stavropoulos⁹,
Vasilis Zorkadis¹⁰ &
…
Ahmed K. Elmagarmid¹¹

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1111))

Included in the following conference series:

International Conference on e-Democracy

1056 Accesses
5 Citations

Abstract

This paper introduces a novel constraint-based hiding model to drastically reduce the preprocessing overhead that is incurred by border-based techniques in the hiding of sensitive frequent itemsets. The proposed model is solved by an efficient constraint-based mining algorithm that pushes a conjunction of antimonotone constraints into an Apriori-like algorithm, for inducing the support theory of non-sensitive frequent itemsets along with its negative border. The patterns induced by the constraint-based mining algorithm can be used in border-based hiding algorithms to construct a sanitized version of the original database, where the sensitive knowledge is concealed. The efficiency of the constraint-based mining algorithm is evaluated on real and synthetic datasets.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
The algorithm processes the itemsets levelwise: it firstly determines the frequent itemsets of size 1, then generates all itemesets of size 2 and determines which of these are frequent, etc. At the n-th level, all frequent itemsets of size n-1 are known and an itemset of size n is frequent if all its subsets of size n-1 are frequent (the Apriory property).
2.
Non-sensitive itemsets are indifferent to our problem and there is to need to be specified explicitly.

References

Abul, O., Gökçe, H.: Knowledge hiding from tree and graph databases. Data Knowl. Eng. 72, 148–171 (2012)
Article Google Scholar
Aggarwal, C., Yu, P.: Privacy-Preserving Data Mining: Models and Algorithms. Advances in Database Systems. Springer, Boston (2008). https://doi.org/10.1007/978-0-387-70992-5
Book Google Scholar
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: VLDB, pp. 487–499 (1994)
Google Scholar
Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: SIGMOD Conference, pp. 439–450 (2000)
Article Google Scholar
Atallah, M., Bertino, E., Elmagarmid, A., Ibrahim, M., Verykios, V.: Disclosure limitation of sensitive rules. In: KDEX Workshop, pp. 45–52. IEEE (1999)
Google Scholar
Atzori, M., Bonchi, F., Giannotti, F., Pedreschi, D.: Anonymity preserving pattern discovery. VLDB J. 17(4), 703–727 (2008)
Article Google Scholar
Bayardo, R.: Efficiently mining long patterns from databases. In: Proceedings of SIGMOD 1998, pp. 85–93 (1998)
Article Google Scholar
Bodon, F.: A fast APRIORI implementation. In: Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, FIMI 2003, vol. 90, pp. 56–65 (2003)
Google Scholar
Bonchi, F., Ferrari, E.: Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques. Chapman & Hall/CRC Data Mining and Knowledge Discovery Series. CRC Press Inc., Boca Raton (2011)
Google Scholar
Bonchi, F., Lucchese, C.: On condensed representations of constrained frequent patterns. Knowl. Inf. Syst. 9(2), 180–201 (2006)
Article Google Scholar
Bonchi, F., et al.: Privacy in spatiotemporal data mining. In: Giannotti, F., Pedreschi, D. (eds.) Mobility, Data Mining and Privacy, pp. 297–333. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-75177-9_12
Chapter Google Scholar
Borgelt, C.: Frequent item set mining. Wiley Interdisc. Rev.: Data Min. Knowl. Discov. 2(6), 437–456 (2012)
Google Scholar
Boulicaut, J.-F., Jeudy, B.: Constraint-based data mining. In: Maimon, O., Rokach, L. (eds.) The Data Mining and Knowledge Discovery Handbook, pp. 399–416. Springer, Boston (2005). https://doi.org/10.1007/0-387-25465-X_18
Chapter Google Scholar
Bu, S., Lakshmanan, L.V.S., Ng, R.T., Ramesh, G.: Preservation of patterns and input-output privacy. In: ICDE, pp. 696–705 (2007)
Google Scholar
Clifton, C.: Protecting against data mining through samples. In: Atluri, V., Hale, J. (eds.) Research Advances in Database and Information Systems Security. ITIFIP, vol. 43, pp. 193–207. Springer, Boston, MA (2000). https://doi.org/10.1007/978-0-387-35508-5_13
Chapter Google Scholar
Dasseni, E., Verykios, V.S., Elmagarmid, A.K., Bertino, E.: Hiding association rules by using confidence and support. In: Moskowitz, I.S. (ed.) IH 2001. LNCS, vol. 2137, pp. 369–383. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45496-9_27
Chapter MATH Google Scholar
Delis, A., Verykios, V.S., Tsitsonis, A.A.: A data perturbation approach to sensitive classification rule hiding. In: SAC, pp. 605–609 (2010)
Google Scholar
Evfimievski, A.V., Srikant, R., Agrawal, R., Gehrke, J.: Privacy preserving mining of association rules. Inf. Syst. 29(4), 343–364 (2004)
Article Google Scholar
Fredman, M.L., Khachiyan, L.: On the complexity of dualization of monotone disjunctive normal forms. J. Algorithms 21, 618–628 (1996)
Article MathSciNet Google Scholar
Frequent Itemset Mining Dataset Repository. http://fimi.ua.ac.be/data/
Gkoulalas-Divanis, A., Verykios, V.: Association Rule Hiding for Data Mining. Advances in Database Systems. Springer, Boston (2010). https://doi.org/10.1007/978-1-4419-6569-1
Book MATH Google Scholar
Gkoulalas-Divanis, A., Verykios, V.S.: An integer programming approach for frequent itemset hiding. In: CIKM, pp. 748–757 (2006)
Google Scholar
Gkoulalas-Divanis, A., Verykios, V.S.: Exact knowledge hiding through database extension. IEEE Trans. Knowl. Data Eng. 21(5), 699–713 (2009)
Article Google Scholar
Gkoulalas-Divanis, A., Verykios, V.S.: Hiding sensitive knowledge without side effects. Knowl. Inf. Syst. 20(3), 263–299 (2009)
Article Google Scholar
Gurvich, V., Khachiyan, L.: Hiding sensitive knowledge without side effects. Discrete Appl. Math. 96–97, 363–373 (1999)
Article Google Scholar
IBM Basket Data Generator. http://sourceforge.net/projects/ibmquestdatagen/
Kantarcioglu, M., Clifton, C.: Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Trans. Knowl. Data Eng. 16(9), 1026–1037 (2004)
Article Google Scholar
Kantarcioglu, M., Jin, J., Clifton, C.: When do data mining results violate privacy? In: KDD, pp. 599–604 (2004)
Google Scholar
Kohavi, R., Brodley, C., Frasca, B., Mason, L., Zheng, Z.: KDD-Cup 2000 organizers’ report: peeling the onion. SIGKDD Explor. 2(2), 86–98 (2000). http://www.ecn.purdue.edu/KDDCUP
Lindell,Y., Pinkas, B.: Privacy preserving data mining. In: CRYPTO, pp. 36–54 (2000)
Chapter Google Scholar
Menon, S., Sarkar, S.: Minimizing information loss and preserving privacy. Manag. Sci. 53(1), 101–116 (2007)
Article Google Scholar
Menon, S., Sarkar, S., Mukherjee, S.: Maximizing accuracy of shared databases when concealing sensitive patterns. Inf. Syst. Res. 16(3), 256–270 (2005)
Article Google Scholar
Moustakides, G.V., Verykios, V.S.: A maxmin approach for hiding frequent itemsets. Data Knowl. Eng. 65(1), 75–89 (2008)
Article Google Scholar
Oliveira, S.R.M., Zaïane,O.R.: Algorithms for balancing privacy and knowledge discovery in association rule mining. In: IDEAS, pp. 54–65 (2003)
Google Scholar
Oliveira, S.R.M., Zaïane,O.R.: Protecting sensitive knowledge by data sanitization. In: ICDM, pp. 613–616 (2003)
Google Scholar
Rizvi, S., Haritsa, J.R.: Maintaining data privacy in association rule mining. In: VLDB, pp. 682–693 (2002)
Chapter Google Scholar
Saygin, Y., Verykios, V.S., Clifton, C.: Using unknowns to prevent discovery of association rules. SIGMOD Rec. 30(4), 45–54 (2001)
Article Google Scholar
Srikant, R., Vu, Q., Agrawal, R.: Mining association rules with item constraints. In: KDD, pp. 67–73 (1997)
Google Scholar
Stavropoulos, E.C., Verykios, V.S., Kagklis, V.: A transversal hypergraph approach for the frequent itemset hiding problem. Knowl. Inf. Syst. 47(3), 625–645 (2016)
Article Google Scholar
Sun, X., Yu, P.S.: A border-based approach for hiding sensitive frequent itemsets. In: ICDM, pp. 426–433 (2005)
Google Scholar
Sun, X., Yu, P.S.: Hiding sensitive frequent itemsets by a border-based approach. JCSE 1(1), 74–94 (2007)
Article Google Scholar
Verykios, V.S., Elmagarmid, A.K., Bertino, E., Saygin, Y., Dasseni, E.: Association rule hiding. IEEE Trans. Knowl. Data Eng. 16(4), 434–447 (2004)
Article Google Scholar
Verykios, V.S., Pontikakis, E.D., Theodoridis, Y., Chang, L.: Efficient algorithms for distortion and blocking techniques in association rule hiding. Distrib. Parallel Databases 22(1), 85–104 (2007)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Big Data Analytics and Anonymization Laboratory, School of Science and Technology, Hellenic Open University, Patras, Greece
Vassilios S. Verykios
Educational Content, Methodology and Technology Laboratory, Hellenic Open University, Patras, Greece
Elias C. Stavropoulos
Hellenic Data Protection Authority, Athens, Greece
Vasilis Zorkadis
Qatar Computing Research Institute, HBKU, Doha, Qatar
Ahmed K. Elmagarmid

Authors

Vassilios S. Verykios
View author publications
You can also search for this author in PubMed Google Scholar
Elias C. Stavropoulos
View author publications
You can also search for this author in PubMed Google Scholar
Vasilis Zorkadis
View author publications
You can also search for this author in PubMed Google Scholar
Ahmed K. Elmagarmid
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Elias C. Stavropoulos .

Editor information

Editors and Affiliations

Open University of Cyprus, Nicosia, Cyprus
Sokratis Katsikas
Hellenic Data Protection Authority, Athens, Greece
Vasilios Zorkadis

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Verykios, V.S., Stavropoulos, E.C., Zorkadis, V., Elmagarmid, A.K. (2020). A Constraint-Based Model for the Frequent Itemset Hiding Problem. In: Katsikas, S., Zorkadis, V. (eds) E-Democracy – Safeguarding Democracy and Human Rights in the Digital Age. e-Democracy 2019. Communications in Computer and Information Science, vol 1111. Springer, Cham. https://doi.org/10.1007/978-3-030-37545-4_4

Download citation

DOI: https://doi.org/10.1007/978-3-030-37545-4_4
Published: 10 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37544-7
Online ISBN: 978-3-030-37545-4
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics