Skip to main content

A Constraint-Based Model for the Frequent Itemset Hiding Problem

  • Conference paper
  • First Online:
E-Democracy – Safeguarding Democracy and Human Rights in the Digital Age (e-Democracy 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1111))

Included in the following conference series:

Abstract

This paper introduces a novel constraint-based hiding model to drastically reduce the preprocessing overhead that is incurred by border-based techniques in the hiding of sensitive frequent itemsets. The proposed model is solved by an efficient constraint-based mining algorithm that pushes a conjunction of antimonotone constraints into an Apriori-like algorithm, for inducing the support theory of non-sensitive frequent itemsets along with its negative border. The patterns induced by the constraint-based mining algorithm can be used in border-based hiding algorithms to construct a sanitized version of the original database, where the sensitive knowledge is concealed. The efficiency of the constraint-based mining algorithm is evaluated on real and synthetic datasets.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    The algorithm processes the itemsets levelwise: it firstly determines the frequent itemsets of size 1, then generates all itemesets of size 2 and determines which of these are frequent, etc. At the n-th level, all frequent itemsets of size n-1 are known and an itemset of size n is frequent if all its subsets of size n-1 are frequent (the Apriory property).

  2. 2.

    Non-sensitive itemsets are indifferent to our problem and there is to need to be specified explicitly.

References

  1. Abul, O., Gökçe, H.: Knowledge hiding from tree and graph databases. Data Knowl. Eng. 72, 148–171 (2012)

    Article  Google Scholar 

  2. Aggarwal, C., Yu, P.: Privacy-Preserving Data Mining: Models and Algorithms. Advances in Database Systems. Springer, Boston (2008). https://doi.org/10.1007/978-0-387-70992-5

    Book  Google Scholar 

  3. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: VLDB, pp. 487–499 (1994)

    Google Scholar 

  4. Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: SIGMOD Conference, pp. 439–450 (2000)

    Article  Google Scholar 

  5. Atallah, M., Bertino, E., Elmagarmid, A., Ibrahim, M., Verykios, V.: Disclosure limitation of sensitive rules. In: KDEX Workshop, pp. 45–52. IEEE (1999)

    Google Scholar 

  6. Atzori, M., Bonchi, F., Giannotti, F., Pedreschi, D.: Anonymity preserving pattern discovery. VLDB J. 17(4), 703–727 (2008)

    Article  Google Scholar 

  7. Bayardo, R.: Efficiently mining long patterns from databases. In: Proceedings of SIGMOD 1998, pp. 85–93 (1998)

    Article  Google Scholar 

  8. Bodon, F.: A fast APRIORI implementation. In: Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, FIMI 2003, vol. 90, pp. 56–65 (2003)

    Google Scholar 

  9. Bonchi, F., Ferrari, E.: Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques. Chapman & Hall/CRC Data Mining and Knowledge Discovery Series. CRC Press Inc., Boca Raton (2011)

    Google Scholar 

  10. Bonchi, F., Lucchese, C.: On condensed representations of constrained frequent patterns. Knowl. Inf. Syst. 9(2), 180–201 (2006)

    Article  Google Scholar 

  11. Bonchi, F., et al.: Privacy in spatiotemporal data mining. In: Giannotti, F., Pedreschi, D. (eds.) Mobility, Data Mining and Privacy, pp. 297–333. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-75177-9_12

    Chapter  Google Scholar 

  12. Borgelt, C.: Frequent item set mining. Wiley Interdisc. Rev.: Data Min. Knowl. Discov. 2(6), 437–456 (2012)

    Google Scholar 

  13. Boulicaut, J.-F., Jeudy, B.: Constraint-based data mining. In: Maimon, O., Rokach, L. (eds.) The Data Mining and Knowledge Discovery Handbook, pp. 399–416. Springer, Boston (2005). https://doi.org/10.1007/0-387-25465-X_18

    Chapter  Google Scholar 

  14. Bu, S., Lakshmanan, L.V.S., Ng, R.T., Ramesh, G.: Preservation of patterns and input-output privacy. In: ICDE, pp. 696–705 (2007)

    Google Scholar 

  15. Clifton, C.: Protecting against data mining through samples. In: Atluri, V., Hale, J. (eds.) Research Advances in Database and Information Systems Security. ITIFIP, vol. 43, pp. 193–207. Springer, Boston, MA (2000). https://doi.org/10.1007/978-0-387-35508-5_13

    Chapter  Google Scholar 

  16. Dasseni, E., Verykios, V.S., Elmagarmid, A.K., Bertino, E.: Hiding association rules by using confidence and support. In: Moskowitz, I.S. (ed.) IH 2001. LNCS, vol. 2137, pp. 369–383. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45496-9_27

    Chapter  MATH  Google Scholar 

  17. Delis, A., Verykios, V.S., Tsitsonis, A.A.: A data perturbation approach to sensitive classification rule hiding. In: SAC, pp. 605–609 (2010)

    Google Scholar 

  18. Evfimievski, A.V., Srikant, R., Agrawal, R., Gehrke, J.: Privacy preserving mining of association rules. Inf. Syst. 29(4), 343–364 (2004)

    Article  Google Scholar 

  19. Fredman, M.L., Khachiyan, L.: On the complexity of dualization of monotone disjunctive normal forms. J. Algorithms 21, 618–628 (1996)

    Article  MathSciNet  Google Scholar 

  20. Frequent Itemset Mining Dataset Repository. http://fimi.ua.ac.be/data/

  21. Gkoulalas-Divanis, A., Verykios, V.: Association Rule Hiding for Data Mining. Advances in Database Systems. Springer, Boston (2010). https://doi.org/10.1007/978-1-4419-6569-1

    Book  MATH  Google Scholar 

  22. Gkoulalas-Divanis, A., Verykios, V.S.: An integer programming approach for frequent itemset hiding. In: CIKM, pp. 748–757 (2006)

    Google Scholar 

  23. Gkoulalas-Divanis, A., Verykios, V.S.: Exact knowledge hiding through database extension. IEEE Trans. Knowl. Data Eng. 21(5), 699–713 (2009)

    Article  Google Scholar 

  24. Gkoulalas-Divanis, A., Verykios, V.S.: Hiding sensitive knowledge without side effects. Knowl. Inf. Syst. 20(3), 263–299 (2009)

    Article  Google Scholar 

  25. Gurvich, V., Khachiyan, L.: Hiding sensitive knowledge without side effects. Discrete Appl. Math. 96–97, 363–373 (1999)

    Article  Google Scholar 

  26. IBM Basket Data Generator. http://sourceforge.net/projects/ibmquestdatagen/

  27. Kantarcioglu, M., Clifton, C.: Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Trans. Knowl. Data Eng. 16(9), 1026–1037 (2004)

    Article  Google Scholar 

  28. Kantarcioglu, M., Jin, J., Clifton, C.: When do data mining results violate privacy? In: KDD, pp. 599–604 (2004)

    Google Scholar 

  29. Kohavi, R., Brodley, C., Frasca, B., Mason, L., Zheng, Z.: KDD-Cup 2000 organizers’ report: peeling the onion. SIGKDD Explor. 2(2), 86–98 (2000). http://www.ecn.purdue.edu/KDDCUP

  30. Lindell,Y., Pinkas, B.: Privacy preserving data mining. In: CRYPTO, pp. 36–54 (2000)

    Chapter  Google Scholar 

  31. Menon, S., Sarkar, S.: Minimizing information loss and preserving privacy. Manag. Sci. 53(1), 101–116 (2007)

    Article  Google Scholar 

  32. Menon, S., Sarkar, S., Mukherjee, S.: Maximizing accuracy of shared databases when concealing sensitive patterns. Inf. Syst. Res. 16(3), 256–270 (2005)

    Article  Google Scholar 

  33. Moustakides, G.V., Verykios, V.S.: A maxmin approach for hiding frequent itemsets. Data Knowl. Eng. 65(1), 75–89 (2008)

    Article  Google Scholar 

  34. Oliveira, S.R.M., Zaïane,O.R.: Algorithms for balancing privacy and knowledge discovery in association rule mining. In: IDEAS, pp. 54–65 (2003)

    Google Scholar 

  35. Oliveira, S.R.M., Zaïane,O.R.: Protecting sensitive knowledge by data sanitization. In: ICDM, pp. 613–616 (2003)

    Google Scholar 

  36. Rizvi, S., Haritsa, J.R.: Maintaining data privacy in association rule mining. In: VLDB, pp. 682–693 (2002)

    Chapter  Google Scholar 

  37. Saygin, Y., Verykios, V.S., Clifton, C.: Using unknowns to prevent discovery of association rules. SIGMOD Rec. 30(4), 45–54 (2001)

    Article  Google Scholar 

  38. Srikant, R., Vu, Q., Agrawal, R.: Mining association rules with item constraints. In: KDD, pp. 67–73 (1997)

    Google Scholar 

  39. Stavropoulos, E.C., Verykios, V.S., Kagklis, V.: A transversal hypergraph approach for the frequent itemset hiding problem. Knowl. Inf. Syst. 47(3), 625–645 (2016)

    Article  Google Scholar 

  40. Sun, X., Yu, P.S.: A border-based approach for hiding sensitive frequent itemsets. In: ICDM, pp. 426–433 (2005)

    Google Scholar 

  41. Sun, X., Yu, P.S.: Hiding sensitive frequent itemsets by a border-based approach. JCSE 1(1), 74–94 (2007)

    Article  Google Scholar 

  42. Verykios, V.S., Elmagarmid, A.K., Bertino, E., Saygin, Y., Dasseni, E.: Association rule hiding. IEEE Trans. Knowl. Data Eng. 16(4), 434–447 (2004)

    Article  Google Scholar 

  43. Verykios, V.S., Pontikakis, E.D., Theodoridis, Y., Chang, L.: Efficient algorithms for distortion and blocking techniques in association rule hiding. Distrib. Parallel Databases 22(1), 85–104 (2007)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Elias C. Stavropoulos .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Verykios, V.S., Stavropoulos, E.C., Zorkadis, V., Elmagarmid, A.K. (2020). A Constraint-Based Model for the Frequent Itemset Hiding Problem. In: Katsikas, S., Zorkadis, V. (eds) E-Democracy – Safeguarding Democracy and Human Rights in the Digital Age. e-Democracy 2019. Communications in Computer and Information Science, vol 1111. Springer, Cham. https://doi.org/10.1007/978-3-030-37545-4_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-37545-4_4

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37544-7

  • Online ISBN: 978-3-030-37545-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics