Achieving Privacy Preservation Constraints in Missing-Value Datasets

Riyana, Surapon; Nanthachumphu, Srikul; Riyana, Noppamas

doi:10.1007/s42979-020-00241-9

Original Research
Published: 04 July 2020

Volume 1, article number 227, (2020)

SN Computer Science

Abstract

Privacy violations must be considered when datasets are released for public use. To address them, various anonymization models have been proposed, e.g., k-anonymity, l-diversity, and t-closeness. However, these models generally assume that every attribute of every record in the dataset is complete. They can therefore be insufficient for datasets that are allowed to contain missing values, e.g., rating datasets and trajectory datasets. This work proposes a privacy preservation model suited to missing-value datasets. In addition to preserving privacy, the proposed model maintains data utility as far as possible. A data utility metric suitable for missing-value datasets is also presented. Finally, the problem addressed by the proposed model is shown to be NP-complete by reduction from the X3C (exact cover by 3-sets) problem.
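
To make the difficulty concrete, the following is a minimal sketch of a conventional k-anonymity check applied to a small rating table with missing entries. It is not the model proposed in the article, which the abstract does not specify. The names `MISSING`, `equivalence_class_sizes`, `is_k_anonymous`, and the sample `ratings` table are hypothetical, and treating a missing value as a distinct value of its own is an assumption made purely for illustration.

```python
from collections import Counter

MISSING = None  # hypothetical marker for a missing value


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values.

    Assumption of this sketch: a missing value is treated as one more
    distinct value, so two records match only if they are missing the
    same attributes and agree on all the others.
    """
    return Counter(
        tuple(record.get(attr, MISSING) for attr in quasi_identifiers)
        for record in records
    )


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every equivalence class contains at least k records."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(count >= k for count in sizes.values())


# Hypothetical rating dataset: each user rates only some items.
ratings = [
    {"item_a": 5, "item_b": MISSING, "item_c": 3},
    {"item_a": 5, "item_b": MISSING, "item_c": 3},
    {"item_a": 5, "item_b": 2, "item_c": 3},
    {"item_a": MISSING, "item_b": 2, "item_c": MISSING},
]
attributes = ["item_a", "item_b", "item_c"]

sizes = equivalence_class_sizes(ratings, attributes)
satisfies_k2 = is_k_anonymous(ratings, attributes, 2)
```

Under this assumption the pattern of missing values itself acts as an identifying feature: the third and fourth records each form a class of size one, so the table is not 2-anonymous even though most observed ratings coincide. A model designed for missing-value datasets has to decide how such records are grouped, which this sketch deliberately leaves open.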

Author information

Authors and Affiliations

Maejo University (MJU), Chiangmai-Phrao Road, Maejo, Sansai, Chiang Mai, 50290, Thailand
Surapon Riyana, Srikul Nanthachumphu & Noppamas Riyana

Corresponding author

Correspondence to Surapon Riyana.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This paper does not contain any studies with human participants or animals performed by any of the authors.

Additional information

This article is part of the topical collection “Privacy, Data Protection and Digital Identity” guest edited by Fernando Boavida, Andrea Praitano and Georgios V. Lioudakis.

About this article

Cite this article

Riyana, S., Nanthachumphu, S. & Riyana, N. Achieving Privacy Preservation Constraints in Missing-Value Datasets. SN COMPUT. SCI. 1, 227 (2020). https://doi.org/10.1007/s42979-020-00241-9

Received: 22 January 2020
Accepted: 24 June 2020
Published: 04 July 2020
DOI: https://doi.org/10.1007/s42979-020-00241-9
