Abstract
The training and application of machine learning models can leak a significant amount of information about their training dataset, which an adversary may recover through membership inference attacks (MIAs) or model inversion in fields such as computer vision and social robotics. Conventional privacy-preserving methods inject differential privacy into the training process, which can degrade convergence or robustness. We first conjecture the steps necessary to carry out a successful membership inference attack in a machine learning setting and then explicitly formulate a defense based on this conjecture. This paper investigates the construction of new training parameters with a Loss-based Differentiation Strategy (LDS) for a new learning model. The main idea of LDS is to partition the training dataset into several folds and sort their training parameters by similarity, so as to enable the privacy-accuracy inequality. The LDS-based model leaks less information under MIA than the primitive learning model and prevents the adversary from generating representative samples. Finally, extensive simulations validate the proposed scheme, and the results demonstrate that LDS lowers MIA accuracy on most CNN models.
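To make the fold-partitioning idea concrete, the following is a minimal Python sketch. The function name lds_partition_and_sort, the cosine similarity metric, the fold count, and the train_fn interface are illustrative assumptions, not the paper's exact procedure.

import numpy as np

def lds_partition_and_sort(dataset, train_fn, num_folds=5, seed=0):
    # Partition the training set into disjoint folds (hypothetical sketch).
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(dataset)), num_folds)
    # Train one model per fold; train_fn is assumed to return a flattened
    # parameter vector for the fold it was trained on.
    params = [train_fn([dataset[i] for i in fold]) for fold in folds]

    def cosine(a, b):
        # Cosine similarity between two parameter vectors.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Order fold parameters by similarity to the first fold's parameters,
    # so that similar parameters can be grouped before aggregation.
    anchor = params[0]
    return sorted(params, key=lambda p: cosine(anchor, p), reverse=True)

Sorting against a single anchor fold is just one plausible reading of "sort their training parameters by similarity"; the paper's own grouping rule may differ.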










Change history
18 July 2022
A Correction to this paper has been published: https://doi.org/10.1007/s11227-022-04716-9
References
Ribeiro M, Grolinger K, Capretz MA (2015) Mlaas: machine learning as a service. In: 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA). IEEE, pp 896–902
Shokri R, Stronati M, Song C, Shmatikov V (2017) Membership inference attacks against machine learning models. In: 2017 IEEE Symposium on Security and Privacy (SP). IEEE, pp 3–18
Ateniese G, Mancini LV, Spognardi A, Villani A, Vitali D, Felici G (2015) Hacking smart machines with smarter ones: how to extract meaningful data from machine learning classifiers. Int J Secur Netw 10(3):137–150
Fredrikson M, Jha S, Ristenpart T (2015) Model inversion attacks that exploit confidence information and basic countermeasures. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp 1322–1333
Hitaj B, Ateniese G, Perez-Cruz F (2017) Deep models under the gan: information leakage from collaborative deep learning. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp 603–618
Shokri R, Strobel M, Zick Y (2019) Privacy risks of explaining machine learning models. CoRR abs/1907.00164
Yeom S, Giacomelli I, Fredrikson M, Jha S (2018) Privacy risk in machine learning: Analyzing the connection to overfitting. In: 2018 IEEE 31st Computer Security Foundations Symposium (CSF), pp 268–282. https://doi.org/10.1109/CSF.2018.00027
Song L, Shokri R, Mittal P (2019) Privacy risks of securing machine learning models against adversarial examples. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp 241–257
Yeom S, Giacomelli I, Fredrikson M, Jha S (2018) Privacy risk in machine learning: analyzing the connection to overfitting. In: 2018 IEEE 31st Computer Security Foundations Symposium (CSF). IEEE, pp 268–282
Nasr M, Shokri R, Houmansadr A (2019) Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning. In: 2019 IEEE Symposium on Security and Privacy (SP). IEEE, pp 739–753
Dinur I, Nissim K (2003) Revealing information while preserving privacy. In: Proceedings of the Twenty-second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp 202–210
Abadi M, Chu A, Goodfellow I, McMahan HB, Mironov I, Talwar K, Zhang L (2016) Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp 308–318
Dwork C, Feldman V, Hardt M, Pitassi T, Reingold O, Roth A (2015) The reusable holdout: preserving validity in adaptive data analysis. Science 349(6248):636–638
Tople S, Sharma A, Nori A (2020) Alleviating privacy attacks via causal learning. In: International Conference on Machine Learning. PMLR, pp 9537–9547
Yu H, Yang K, Zhang T, Tsai Y-Y, Ho T-Y, Jin Y (2020) Cloudleak: large-scale deep learning models stealing through adversarial examples. In: NDSS
Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2017) mixup: beyond empirical risk minimization. arXiv preprint arXiv:1710.09412
Yaghini M, Kulynych B, Cherubin G, Troncoso C (2019) Disparate vulnerability: on the unfairness of privacy attacks against machine learning. arXiv preprint arXiv:1906.00389
Song C, Raghunathan A (2020) Information leakage in embedding models. In: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, pp 377–390
Fioretto F, Mitridati L, Van Hentenryck P (2020) Differential privacy for stackelberg games. arXiv preprint arXiv:2002.00944
Phan N, Vu M, Liu Y, Jin R, Dou D, Wu X, Thai MT (2019) Heterogeneous gaussian mechanism: Preserving differential privacy in deep learning with provable robustness. arXiv preprint arXiv:1906.01444
Cummings R, Gupta V, Kimpara D, Morgenstern J (2019) On the compatibility of privacy and fairness. In: Adjunct Publication of the 27th Conference on User Modeling, Adaptation and Personalization, pp 309–315
Chaudhuri K, Monteleoni C, Sarwate AD (2011) Differentially private empirical risk minimization. J Mach Learn Res 12(3)
Papernot N, Song S, Mironov I, Raghunathan A, Talwar K, Erlingsson Ú (2018) Scalable private learning with pate. arXiv preprint arXiv:1802.08908
Sablayrolles A, Douze M, Schmid C, Ollivier Y, Jégou H (2019) White-box vs black-box: Bayes optimal strategies for membership inference. In: International Conference on Machine Learning. PMLR, pp 5558–5567
Wei Z, Pei Q, Zhang N, Liu X, Wu C, Taherkordi A (2021) Lightweight federated learning for large-scale iot devices with privacy guarantee. IEEE Internet Things J. https://doi.org/10.1109/JIOT.2021.3127886
Truex S, Baracaldo N, Anwar A, Steinke T, Ludwig H, Zhang R, Zhou Y (2019) A hybrid approach to privacy-preserving federated learning. In: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, pp 1–11
Aono Y, Hayashi T, Wang L, Moriai S et al (2017) Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans Inf Forensics Secur 13(5):1333–1345
Wang Z., Song M, Zhang Z, Song Y, Wang Q, Qi H (2019) Beyond inferring class representatives: User-level privacy leakage from federated learning. In: IEEE INFOCOM 2019-IEEE Conference on Computer Communications. IEEE, pp 2512–2520
Zhu L, Liu Z, Han S (2019) Deep leakage from gradients. Advances in Neural Information Processing Systems 32
Shokri R, Shmatikov V (2015) Privacy-preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp 1310–1321
Hastie T, Tibshirani R (1996) Discriminant analysis by gaussian mixtures. J Royal Stat Soc Series B (Methodol) 58(1):155–176
Long Y, Bindschaedler V, Wang L, Bu D, Wang X, Tang H, Gunter CA, Chen K (2018)Understanding membership inferences on well-generalized learning models. arXiv preprint arXiv:1802.04889
Leino K, Fredrikson M (2020) Stolen memories: leveraging model memorization for calibrated white-box membership inference. In: 29th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 20), pp 1605–1622
Wang S, Tuor T, Salonidis T, Leung KK, Makaya C, He T, Chan K (2019) Adaptive federated learning in resource constrained edge computing systems. IEEE J Sel Areas Commun 37(6):1205–1221
LeCun Y, et al.: Lenet-5, convolutional neural networks. http://yann.lecun.com/exdb/lenet 20(5), 14 (2015)
Li W (2017) cifar-10-cnn: play deep learning with CIFAR datasets. https://github.com/BIGBALLON/cifar-10-cnn
Zhu C, Han S, Mao H, Dally WJ (2016) Trained ternary quantization. arXiv preprint arXiv:1612.01064
Xiao H, Rasul K, Vollgraf R (2017) Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747
Krizhevsky A (2017) The CIFAR-100 dataset. https://www.cs.toronto.edu/~kriz/cifar.html (Accessed: 2022-01-01)
He Z, Zhang T, Lee RB (2019) Model inversion attacks against collaborative inference. In: Proceedings of the 35th Annual Computer Security Applications Conference, pp 148–162
Zhao Y, Li M, Lai L, Suda N, Civin D, Chandra V (2018)Federated Learning with Non-IID Data. arXiv . https://doi.org/10.48550/arXiv.1806.00582. https://arXiv.org/abs/1806.00582
Brun M, Xu Q, Dougherty ER (2008) A criterion for choosing between full-sample and hold-out classifier design. In: 2008 IEEE International Workshop on Genomic Signal Processing and Statistics, pp 1–2. https://doi.org/10.1109/GENSIPS.2008.4555662
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 62072485) and the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515011294).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: The Acknowledgements section was missing from this article.
Appendix A Proof
A.1 Membership inference advantage
Lemma 2
\(\mathrm{Adv}^{\mathrm{Sa}} \le \mathrm{Adv}^{\mathrm{Agg}}\)
Proof
where \(L(\cdot )\) denotes \(L((x, y), F)\).
\(\square\)
A.2 Membership adversary
Lemma 3
The two curves intersect at two points \(\sigma _{1} < \sigma _{2}\).
Proof
Let \(f\left( \varepsilon ,\sigma \right)\) denote the probability density of a zero-mean normal distribution with standard deviation \(\sigma\).
where we substitute \(t = \frac{\varepsilon }{\sqrt{2}\sigma }\); when \(\varepsilon = x\), \(t = \frac{x}{\sqrt{2}\sigma }\).
Hence, we can get:
\(\square\)
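For intuition, the intersection points of two such densities can be computed in closed form; the following sketch assumes zero-mean normal densities with \(\sigma _{1} < \sigma _{2}\), as defined above:

\(f\left( \varepsilon ,\sigma _{1} \right) = f\left( \varepsilon ,\sigma _{2} \right) \Longleftrightarrow \frac{1}{\sqrt{2\pi }\sigma _{1}}e^{-\varepsilon ^{2}/2\sigma _{1}^{2}} = \frac{1}{\sqrt{2\pi }\sigma _{2}}e^{-\varepsilon ^{2}/2\sigma _{2}^{2}} \Longleftrightarrow \varepsilon ^{2} = \frac{2\sigma _{1}^{2}\sigma _{2}^{2}\ln \left( \sigma _{2}/\sigma _{1} \right) }{\sigma _{2}^{2}-\sigma _{1}^{2}}\)

so the two curves cross at \(\varepsilon _{eq} = \pm \sqrt{2}\,\sigma _{2}\sqrt{\frac{\ln \left( \sigma _{2}/\sigma _{1} \right) }{\left( \sigma _{2}/\sigma _{1} \right) ^{2}-1}}\), which matches, up to the factor \(\sqrt{2}\sigma _{2}\), the shorthand \(ln(\cdot )\) defined next.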
Meanwhile, for the sake of simplicity, let \(ln(\cdot )\) denote \(\sqrt{\frac{\ln {\left( \sigma _{O}^{S}/\sigma _{O}^{T} \right) }}{\left( \sigma _{O}^{S}/\sigma _{O}^{T} \right) ^{2}-1}}\).
The advantage of the two structures is expressed as:
Lemma 4
\(\frac{\sigma _{O}^{S}}{\sigma _{T}^{S}} \le \frac{\sigma _{N}^{F}}{\sigma _{T}^{F}}\)
Proof
To begin, assuming that \(F_{1}\) and \(\mathrm{erf}\left( x \right)\) increase as \(x\) increases, we can get:
and
We also get \(\mathrm{erf} \left( \frac{\sigma _{O}^{S}}{\sigma _{P}^{S}}F_{2}\left( \frac{\sigma _{O}^{S}}{\sigma _{T}^{S}} \right) \right) \ge \mathrm{erf}\left( F_{2}\left( \frac{\sigma _{O}^{S}}{\sigma _{T}^{S}} \right) \right)\) since \(\sigma _{P}^{S} \le \sigma _{O}^{S}\).
\(\therefore\)
and
Finally,
\(\because\) \(F_{2}\) increases as \(x\) decreases, \(\therefore\)
and
Hence, we can get \(\mathrm{Adv}^{\mathrm{Agg}} > \mathrm{Adv}^{\mathrm{Sa}}\).
\(\square\)
A.3 Probability
Lemma 5
Proof
\(\because\) \(P\left( \left( x^{*},y \right) \in D_{P} \right) < \max _{{D}'\subseteq D,\ {D}'\cap D_{T} = \emptyset } P\left( \left( x^{*},y \right) \in {D}' \right)\)
\(\therefore\) \(\mathrm{Adv}^{\mathrm{Sa}} < P\left( \left( x^{*},y \right) \in D_{T} \right) - P\left( \left( x^{*},y \right) \in D_{P} \right)\)
We can get
Let \(t = \frac{\varepsilon }{\sqrt{2}\sigma }\); as \(\varepsilon \rightarrow +\infty\), \(t \rightarrow +\infty\); as \(\varepsilon \rightarrow 0\), \(t \rightarrow 0\); and as \(\varepsilon \rightarrow |\varepsilon _{eq} |\), \(t \rightarrow \frac{|\varepsilon _{eq} |}{\sqrt{2}\sigma }\).
Given \(F\left( x \right) = \mathrm{erfc}\left( x \right) -\mathrm{erf}\left( x \right) = 1-2\,\mathrm{erf}\left( x \right)\), so that \(F'\left( x \right) = -\frac{4}{\sqrt{\pi }}e^{-x^{2}}< 0\) and \(F\left( x \right)\) increases as \(x\) decreases, and noting \(F\left( 1/2 \right) <0\), we get
Now let \(G\left( x \right) = \frac{x^{2}\ln \left( x \right) }{x^{2}-1}\), which is monotonically increasing; by L'Hôpital's rule,
\(\lim _{x\rightarrow 1}G\left( x \right) = \lim _{x\rightarrow 1}\frac{x^{2}\ln x}{x^{2}-1} = \lim _{x\rightarrow 1}\frac{2x\ln x + x}{2x} = 1/2\)
Since \(F\left( x \right)\) increases as \(x\) decreases, \(F\left( \frac{1}{\sqrt{x}} \right)< F\left( \frac{1}{2} \right) < 0\).
\(\therefore\) \(\mathrm{erfc}\left( \frac{|\varepsilon _{eq} |}{\sqrt{2}\sigma } \right) - \mathrm{erf}\left( \frac{|\varepsilon _{eq} |}{\sqrt{2}\sigma } \right) < 0\)
\(\square\)
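As a quick numerical sanity check of this sign argument, here is an illustrative Python snippet (not part of the original proof); it uses only the standard library:

from math import erf, erfc

def F(x):
    # F(x) = erfc(x) - erf(x) = 1 - 2*erf(x); strictly decreasing in x
    # and crossing zero near x = 0.4769 (where erf(x) = 1/2).
    return erfc(x) - erf(x)

print(F(0.5))  # about -0.041 < 0, matching F(1/2) < 0 above
print(F(0.4))  # about +0.143 > 0, below the zero crossing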
Cite this article
Yang, Q., Wang, T., Zhu, K. et al. Loss-based differentiation strategy for privacy preserving of social robots. J Supercomput 79, 321–348 (2023). https://doi.org/10.1007/s11227-022-04660-8