Abstract
Machine learning in artificial intelligence relies on legitimate big data, and publishing such data raises many privacy issues. m-Invariance is a fundamental privacy-preserving notion for microdata republication. Unfortunately, when applied to big data release, existing generalization-based m-Invariance must modify the original microdata, which causes data utility loss and poor aggregate query performance. Furthermore, because quasi-identifiers in big data are high-dimensional, the cost of the generalization operations becomes unaffordable, which makes the approach impractical. In this paper, we remedy these drawbacks to achieve m-Invariance in big data release. We first propose a new anatomy-based m-Invariance definition and framework, where the anatomy approach achieves privacy by breaking the correlation between the sensitive attributes and the non-sensitive identifiers. We next establish a series of criteria that allow anatomy to cope with republication caused by data dynamics. We then develop an algorithm that realizes these ideas. Theoretical and experimental analyses confirm the advantages of our anatomy-based m-Invariance approach in terms of data utility, aggregate query accuracy, and the capacity to handle high-dimensional quasi-identifiers in big data release.
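To make the two notions in the abstract concrete, the following is a minimal Python sketch, not the algorithm proposed in the paper. It assumes the commonly used formulations: anatomy publishes the quasi-identifiers unmodified in one table and the sensitive values in a second table, linked only by a group ID; m-Invariance requires that every group in every release holds at least m records with pairwise distinct sensitive values, and that a record appearing in several releases always sits in a group with the same set of sensitive values (its signature). All function names, the record layout, and the toy data are hypothetical, and the grouping itself is taken as given.

```python
from collections import Counter, defaultdict

# Hypothetical layout: records maps record_id -> (quasi_identifier_tuple, sensitive_value),
# groups maps record_id -> group_id. How groups are formed is outside this sketch.

def anatomize(records, groups):
    """Split a release into a quasi-identifier table (QIT) and a sensitive table (ST)."""
    # QIT keeps quasi-identifiers exactly as they are, plus the group ID.
    qit = [(qi, groups[rid]) for rid, (qi, _) in records.items()]
    # ST lists, per group, each sensitive value and how often it occurs.
    counts = Counter((groups[rid], s) for rid, (_, s) in records.items())
    st = sorted((gid, s, n) for (gid, s), n in counts.items())
    return qit, st

def is_m_unique(records, groups, m):
    """Every group has at least m records, all with distinct sensitive values."""
    by_group = defaultdict(list)
    for rid, (_, s) in records.items():
        by_group[groups[rid]].append(s)
    return all(len(v) >= m and len(set(v)) == len(v) for v in by_group.values())

def signatures(records, groups):
    """Map each record to the set of sensitive values found in its group."""
    sig = defaultdict(set)
    for rid, (_, s) in records.items():
        sig[groups[rid]].add(s)
    return {rid: frozenset(sig[groups[rid]]) for rid in records}

def is_m_invariant(releases, m):
    """Check a sequence of (records, groups) releases against the assumed definition."""
    seen = {}
    for records, groups in releases:
        if not is_m_unique(records, groups, m):
            return False
        for rid, sig in signatures(records, groups).items():
            # A record's signature must never change between releases.
            if seen.setdefault(rid, sig) != sig:
                return False
    return True

# Toy data (invented for illustration only).
release_1 = (
    {"r1": ((25, "F"), "flu"), "r2": ((31, "M"), "asthma"),
     "r3": ((47, "F"), "flu"), "r4": ((52, "M"), "diabetes")},
    {"r1": 1, "r2": 1, "r3": 2, "r4": 2},
)
# r2 is deleted and r5 inserted; r5 carries the same sensitive value as r2.
release_2_ok = (
    {"r1": ((25, "F"), "flu"), "r5": ((29, "M"), "asthma"),
     "r3": ((47, "F"), "flu"), "r4": ((52, "M"), "diabetes")},
    {"r1": 1, "r5": 1, "r3": 2, "r4": 2},
)
# Here r5 carries a different value, so the signature of r1 changes.
release_2_bad = (
    {"r1": ((25, "F"), "flu"), "r5": ((29, "M"), "bronchitis"),
     "r3": ((47, "F"), "flu"), "r4": ((52, "M"), "diabetes")},
    {"r1": 1, "r5": 1, "r3": 2, "r4": 2},
)

qit_1, st_1 = anatomize(*release_1)
ok = is_m_invariant([release_1, release_2_ok], m=2)
bad = is_m_invariant([release_1, release_2_bad], m=2)
```

In this sketch the quasi-identifier tuples in the QIT are never generalized, which is the property the abstract credits for preserving utility and aggregate query accuracy; the privacy guarantee rests entirely on how records are grouped across releases.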
Acknowledgements
Haibin Zheng is the corresponding author. This paper is supported by the National Key R&D Program of China through project H1943050901 and by the Natural Science Foundation of China through projects 61972019, 61932011, and 61772538.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, H., Ma, W., Zheng, H., Liang, Z., Wu, Q. (2019). Privacy-Preserving Sequential Data Publishing. In: Liu, J., Huang, X. (eds) Network and System Security. NSS 2019. Lecture Notes in Computer Science, vol 11928. Springer, Cham. https://doi.org/10.1007/978-3-030-36938-5_37
DOI: https://doi.org/10.1007/978-3-030-36938-5_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36937-8
Online ISBN: 978-3-030-36938-5
eBook Packages: Computer Science, Computer Science (R0)