Abstract
The development of deep learning (DL) technology depends on the availability of large-scale image datasets to train deep neural networks (DNNs) for image classification. However, many raw image datasets contain sensitive identity feature information that prohibits entities from disclosing the data under privacy regulations. For example, an image dataset may include age or gender information that could be used to identify an individual. Furthermore, medical images may include additional disease information that could lead to patient re-identification. To address this problem, we propose an image transformation scheme that uses a convolutional autoencoder and a multi-output classification model for privacy-enhanced deep learning. The proposed scheme obfuscates the visual information in an image while retaining the attribute features required for model utility. Additionally, the proposed method enhances privacy by generating encoded images that exclude sensitive identity feature information. First, we train a multi-output convolutional neural network (CNN) to classify identity features and image attributes. Second, we use the pre-trained multi-output classifier as a regularizer when training a standard convolutional autoencoder, so that the autoencoder generates obfuscated versions of the original images that exclude identity feature information and preserve the attribute features that are useful for classification. Our results on the CelebA and CIFAR-100 datasets illustrate that the proposed method successfully degrades the classification accuracy of sensitive image data while maintaining model utility for non-sensitive data features.
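The two-step training described in the abstract can be illustrated with a minimal sketch of one possible regularized objective. The abstract does not give the loss, so everything here is an assumption made for illustration: the function names, the weights `alpha` and `beta`, the use of mean squared error for reconstruction, and the choice to subtract the identity cross-entropy. The sketch operates on plain Python lists standing in for pixel values and for the softmax outputs of the pre-trained multi-output classifier.

```python
import math

def mse(x, x_hat):
    """Mean squared error between two equal-length lists of pixel values."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def cross_entropy(probs, label, eps=1e-12):
    """Cross-entropy of a predicted distribution against one true class index."""
    return -math.log(max(probs[label], eps))

def regularized_loss(x, x_hat, attr_probs, attr_label,
                     id_probs, id_label, alpha=1.0, beta=1.0):
    """Hypothetical autoencoder objective (an assumption, not the paper's loss).

    attr_probs and id_probs stand in for the outputs of the frozen,
    pre-trained multi-output classifier applied to the encoded image x_hat.
    """
    recon = mse(x, x_hat)                             # autoencoder reconstruction term
    attr_ce = cross_entropy(attr_probs, attr_label)   # low when attributes stay classifiable
    id_ce = cross_entropy(id_probs, id_label)         # high when identity is hard to classify
    # Reward attribute utility and penalize identity leakage.
    return recon + alpha * attr_ce - beta * id_ce

# Example: the same encoded image scored under two identity predictions.
x, x_hat = [0.0, 1.0], [0.1, 0.9]
leaky = regularized_loss(x, x_hat, [0.8, 0.2], 0, [0.9, 0.1], 0)
private = regularized_loss(x, x_hat, [0.8, 0.2], 0, [0.5, 0.5], 0)
print(leaky > private)
```

Under these assumptions, an encoding whose identity prediction is close to chance receives a lower loss than one from which the identity is confidently recovered, while the attribute term keeps the useful label predictable. Subtracting a cross-entropy is unbounded below, so a practical implementation would likely bound or reformulate the identity term.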
Research supported in part by NSF CREST Grant HRD-1736209 (RK) and NSF CAREER Grant CNS-1553696 (RK).
© 2023 IFIP International Federation for Information Processing
About this paper
Cite this paper
Rodriguez, D., Krishnan, R. (2023). An Autoencoder-Based Image Anonymization Scheme for Privacy Enhanced Deep Learning. In: Atluri, V., Ferrara, A.L. (eds) Data and Applications Security and Privacy XXXVII. DBSec 2023. Lecture Notes in Computer Science, vol 13942. Springer, Cham. https://doi.org/10.1007/978-3-031-37586-6_18
DOI: https://doi.org/10.1007/978-3-031-37586-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37585-9
Online ISBN: 978-3-031-37586-6
eBook Packages: Computer Science, Computer Science (R0)