Abstract
In this paper, we propose a practical privacy-preserving generative model for data sanitization and sharing, called Sanitizer-Variational Autoencoder (SVAE). We assume that the data consists of identification-relevant and irrelevant components. A variational autoencoder (VAE) based sanitization model is proposed to strip the identification-relevant features and only retain identification-irrelevant components in a privacy-preserving manner. The sanitization allows for task-relevant discrimination (utility) but minimizes the personal identification information leakage (privacy). We conduct extensive empirical evaluations on the real-world face, biometric signal and speech datasets, and validate the effectiveness of our proposed SVAE, as well as the robustness against the membership inference attack.
Keywords
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Abadi, M., et al.: Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 308–318. ACM (2016)
Abdulkader, S.N., Atia, A., Mostafa, M.S.M.: Brain computer interfacing: applications and challenges. Egypt. Inf. J. 16(2), 213–230 (2015)
Alyasseri, Z.A.A., Khader, A.T., Al-Betar, M.A., Papa, J.P., Alomari, O.A.: Eeg feature extraction for person identification using wavelet decomposition and multi-objective flower pollination algorithm. IEEE Access 6, 76007–76024 (2018)
Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein gan. arXiv preprint arXiv:1701.07875 (2017)
Beaulieu-Jones, B.K., Wu, Z.S., Williams, C., Greene, C.S.: Privacy-preserving generative deep neural networks support clinical data sharing. BioRxiv, p. 159756 (2017)
Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: Infogan: interpretable representation learning by information maximizing generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2172–2180 (2016)
Cynthia, D.: Differential privacy. Automata, languages and programming, pp. 1–12 (2006)
Esteban, C., Hyland, S.L., Rätsch, G.: Real-valued (medical) time series generation with recurrent conditional gans. arXiv preprint arXiv:1706.02633 (2017)
Garofolo, J.S., Lamel, L.F., Fisher, W.M., Fiscus, J.G., Pallett, D.S.: Darpa timit acoustic-phonetic continous speech corpus cd-rom. nist speech disc 1–1.1. NASA STI/Recon technical report n 93 (1993)
Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015)
Goldberger, A.L., et al.: Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23), e215–e220 (2000)
Guibas, J.T., Virdi, T.S., Li, P.S.: Synthetic medical images from dual generative adversarial networks. arXiv preprint arXiv:1709.01872 (2017)
Kingma, D.P., Welling, M.: Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013)
Kumari, P., Vaish, A.: Brainwave based authentication system: research issues and challenges. Int. J. Comput. Eng. Appl. 4(1), 2 (2014)
Larsen, A.B.L., Sønderby, S.K., Larochelle, H., Winther, O.: Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300 (2015)
Li, Y., Swersky, K., Zemel, R.: Generative moment matching networks. In: International Conference on Machine Learning, pp. 1718–1727 (2015)
Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3730–3738 (2015)
Lyu, L., et al.: Towards fair and privacy-preserving federated deep models. IEEE Trans. Parallel Distrib. Syst. 31(11), 2524–2541 (2020)
Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., Frey, B.: Adversarial autoencoders. arXiv preprint arXiv:1511.05644 (2015)
Mescheder, L., Nowozin, S., Geiger, A.: Adversarial variational bayes: unifying variational autoencoders and generative adversarial networks. arXiv preprint arXiv:1701.04722 (2017)
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training gans. In: Advances in Neural Information Processing Systems, pp. 2234–2242 (2016)
Schirrmeister, R.T., et al.: Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp. 38(11), 5391–5420 (2017)
Shokri, R., Stronati, M., Song, C., Shmatikov, V.: Membership inference attacks against machine learning models. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 3–18. IEEE (2017)
Song, S., Chaudhuri, K., Sarwate, A.D.: Stochastic gradient descent with differentially private updates. In: 2013 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 245–248. IEEE (2013)
Xie, L., Lin, K., Wang, S., Wang, F., Zhou, J.: Differentially private generative adversarial network. arXiv preprint arXiv:1802.06739 (2018)
Zhang, X., Ji, S., Wang, T.: Differentially private releasing via deep generative model. arXiv preprint arXiv:1801.01594 (2018)
Zue, V., Seneff, S., Glass, J.: Speech database development at mit: Timit and beyond. Speech Commun. 9(4), 351–356 (1990)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, S. et al. (2020). Privacy-Preserving Data Generation and Sharing Using Identification Sanitizer. In: Huang, Z., Beek, W., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2020. WISE 2020. Lecture Notes in Computer Science(), vol 12343. Springer, Cham. https://doi.org/10.1007/978-3-030-62008-0_13
Download citation
DOI: https://doi.org/10.1007/978-3-030-62008-0_13
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62007-3
Online ISBN: 978-3-030-62008-0
eBook Packages: Computer ScienceComputer Science (R0)