Abstract
Recently, Mahloujifar and Mahmoody (Theory of Cryptography Conference ’17) studied attacks against learning algorithms using a special case of Valiant’s malicious noise, called p-tampering, in which the adversary may change each training example with independent probability p but is limited to choosing ‘adversarial’ examples with correct labels. They obtained p-tampering attacks that increase the error probability in the so-called ‘targeted’ poisoning model, in which the adversary’s goal is to increase the loss of the trained hypothesis on a particular test example. At the heart of their attack was an efficient algorithm for biasing the expected value of any bounded real-valued function through p-tampering. In this work, we present new biasing attacks that increase the expected value of bounded real-valued functions. Our improved biasing attacks directly imply improved p-tampering attacks against learners in the targeted poisoning model; as a bonus, they come with considerably simpler analyses. We also study the possibility of PAC learning under p-tampering attacks in the non-targeted (a.k.a. indiscriminate) setting, where the adversary’s goal is to increase the risk of the generated hypothesis over a random test example. We show that PAC learning is possible under p-tampering poisoning attacks essentially whenever it is possible in the realizable setting without attacks. We further show that PAC learning under ‘no-mistake’ adversarial noise becomes impossible if the adversary may choose which examples to tamper with (still limited to a p fraction of them) and substitute them with adversarially chosen ones. Our formal model for such ‘bounded-budget’ tampering attackers is inspired by notions of adaptive corruption in cryptography.
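The p-tampering model described above can be illustrated with a toy simulation. The sketch below is our own illustration, not the paper’s attack: it takes f to be the average of n i.i.d. uniform bits, a bounded function into [0,1]. Since this f is monotone, a greedy adversary that forces a 1 whenever it is allowed to tamper biases the expectation from 1/2 up to roughly 1/2 + p/2, while every substituted example remains in the support of the clean distribution (the analogue of the ‘correct label’ restriction).

```python
import random

def f(bits):
    # Bounded real-valued function f: {0,1}^n -> [0,1], here the average bit.
    return sum(bits) / len(bits)

def sample_clean(n):
    # Untampered training stream: n i.i.d. uniform bits.
    return [random.randint(0, 1) for _ in range(n)]

def sample_p_tampered(n, p):
    # p-tampering: each block is handed to the adversary with independent
    # probability p. Since f is monotone, the greedy tamper is to force a 1,
    # which still lies in the support of the clean distribution.
    return [1 if random.random() < p else random.randint(0, 1) for _ in range(n)]

def estimate_mean(sampler, trials=20000):
    # Monte Carlo estimate of E[f] under the given sampler.
    return sum(f(sampler()) for _ in range(trials)) / trials

n, p = 20, 0.4
clean = estimate_mean(lambda: sample_clean(n))
tampered = estimate_mean(lambda: sample_p_tampered(n, p))
print(clean, tampered)  # clean is near 0.5; tampered is near 0.5 + p/2 = 0.7
```

For general (non-monotone) bounded functions the optimal substitution is not obvious, and designing efficient biasing strategies for that case is exactly the technical problem the biasing attacks address.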
References
Awasthi, P., Balcan, M.F., Long, P.M.: The power of localization for efficiently learning linear separators with noise. In: Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pp 449–458. ACM (2014)
Austrin, P., Chung, K.-M., Mahmoody, M., Pass, R., Seth, K.: On the impossibility of cryptography with tamperable randomness. In: International Cryptology Conference, pp 462–479. Springer (2014)
Angluin, D., Krikis, M., Sloan, R.H., Turán, G.: Malicious omissions and errors in answers to membership queries. Mach. Learn. 28(2–3), 211–255 (1997)
Aumann, Y., Lindell, Y.: Security against covert adversaries: Efficient protocols for realistic adversaries. In: Theory of Cryptography Conference, pp 137–156. Springer (2007)
Angluin, D.: Queries and concept learning. Mach. Learn. 2(4), 319–342 (1987)
Beigi, S., Etesami, O., Gohari, A.: Deterministic randomness extraction from generalized and distributed Santha–Vazirani sources. SIAM J. Comput. 46(1), 1–36 (2017)
Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Occam’s razor. Inf. Process. Lett. 24(6), 377–380 (1987)
Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Learnability and the Vapnik-Chervonenkis dimension. J. ACM 36(4), 929–965 (1989)
Bshouty, N.H., Eiron, N., Kushilevitz, E.: PAC learning with nasty noise. Theor. Comput. Sci. 288(2), 255–275 (2002)
Bentov, I., Gabizon, A., Zuckerman, D.: Bitcoin beacon. arXiv:1605.04559 (2016)
Benedek, G.M., Itai, A.: Learnability with respect to fixed distributions. Theor. Comput. Sci. 86(2), 377–390 (1991)
Biggio, B., Nelson, B., Laskov, P.: Poisoning attacks against support vector machines. In: Proceedings of the 29th International Conference on Machine Learning, pp 1467–1474. Omnipress (2012)
Barreno, M., Nelson, B., Sears, R., Joseph, A.D., Tygar, J.D.: Can machine learning be secure? In: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, pp 16–25. ACM (2006)
Canetti, R., Feige, U., Goldreich, O., Naor, M.: Adaptively secure multi-party computation. In: 28th Annual ACM Symposium on Theory of Computing, pp 639–648. ACM Press, Philadelphia (1996)
Chor, B., Goldreich, O.: Unbiased bits from sources of weak randomness and probabilistic communication complexity. In: 26th Annual Symposium on Foundations of Computer Science (FOCS), pp 429–442. IEEE (1985)
Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat. 23(4), 493–507 (1952)
Charikar, M., Steinhardt, J., Valiant, G.: Learning from untrusted data. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp 47–60. ACM (2017)
Diochnos, D.I.: On the evolution of monotone conjunctions: drilling for best approximations. In: ALT, pp 98–112 (2016)
Diakonikolas, I., Kamath, G., Kane, D.M., Li, J., Moitra, A., Stewart, A.: Robust estimators in high dimensions without the computational intractability. In: 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pp 655–664. IEEE (2016)
Diakonikolas, I., Kamath, G., Kane, D.M., Li, J., Steinhardt, J., Stewart, A.: Sever: a robust meta-algorithm for stochastic optimization. arXiv:1803.02815 (2018)
Diakonikolas, I., Kane, D.M., Stewart, A.: Statistical query lower bounds for robust estimation of high-dimensional Gaussians and Gaussian mixtures. In: 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pp 73–84. IEEE (2017)
Diakonikolas, I., Kane, D.M., Stewart, A.: List-decodable robust mean estimation and learning mixtures of spherical Gaussians. In: Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pp 1047–1060. ACM (2018)
Diakonikolas, I., Kong, W., Stewart, A.: Efficient algorithms and lower bounds for robust linear regression. arXiv:1806.00040 (2018)
Dodis, Y., Ong, S.J., Prabhakaran, M., Sahai, A.: On the (im)possibility of cryptography with imperfect randomness. In: IEEE Symposium on Foundations of Computer Science (FOCS) (2004)
Dwork, C., Rothblum, G.N., Vadhan, S.: Boosting and differential privacy. In: 2010 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp 51–60. IEEE (2010)
Dodis, Y., Yao, Y.: Privacy with imperfect randomness. In: Annual Cryptology Conference, pp 463–482. Springer (2015)
Ehrenfeucht, A., Haussler, D., Kearns, M.J., Valiant, L.G.: A general lower bound on the number of examples needed for learning. Inf. Comput. 82(3), 247–261 (1989)
Etesami, O., Mahloujifar, S., Mahmoody, M.: Computational concentration of measure: Optimal bounds, reductions, and more. arXiv:1907.05401. To appear in SODA 2020 (2019)
González, C.R., Abu-Mostafa, Y.S.: Mismatched training and test distributions can outperform matched ones. Neural Comput. 27(2), 365–387 (2015)
Garg, S., Jha, S., Mahloujifar, S., Mahmoody, M.: Adversarially robust learning could leverage computational hardness. arXiv:1905.11564 (2019)
Goldwasser, S., Kalai, Y.T., Park, S.: Adaptively secure coin-flipping, revisited. In: International Colloquium on Automata, Languages, and Programming, pp 663–674. Springer (2015)
Haitner, I., Ishai, Y., Kushilevitz, E., Lindell, Y., Petrank, E.: Black-box constructions of protocols for secure computation. Cryptology ePrint Archive, Report 2010/164 http://eprint.iacr.org/2010/164 (2010)
Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963)
Kearns, M.J., Li, M.: Learning in the presence of malicious errors. SIAM J. Comput. 22(4), 807–837 (1993)
Lai, K.A., Rao, A.B., Vempala, S.: Agnostic estimation of mean and covariance. In: 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pp 665–674. IEEE (2016)
Mahloujifar, S., Mahmoody, M.: Blockwise p-tampering attacks on cryptographic primitives, extractors, and learners. In: Theory of Cryptography Conference, pp 245–279. Springer (2017)
Mahloujifar, S., Mahmoody, M.: Blockwise p-tampering attacks on cryptographic primitives, extractors, and learners. Cryptology ePrint Archive, Report 2017/950 https://eprint.iacr.org/2017/950 (2017)
Mahloujifar, S., Mahmoody, M.: Can adversarially robust learning leverage computational hardness? In: Algorithmic Learning Theory, pp 581–609 (2019)
Nakamoto, S.: Bitcoin: A peer-to-peer electronic cash system (2008)
Papernot, N., McDaniel, P., Sinha, A., Wellman, M.: Towards the science of security and privacy in machine learning. arXiv:1611.03814 (2016)
Prasad, A., Suggala, A.S., Balakrishnan, S., Ravikumar, P.: Robust estimation via robust gradient estimation. arXiv:1802.06485 (2018)
Pitt, L., Valiant, L.G.: Computational limitations on learning from examples. J. ACM 35(4), 965–984 (1988)
Rao, C.R.: Information and the accuracy attainable in the estimation of statistical parameters. In: Breakthroughs in Statistics, pp 235–247. Springer (1992)
Rubinstein, B.I.P., Nelson, B., Huang, L., Joseph, A.D., Lau, S.-h., Rao, S., Taft, N., Tygar, J.D.: Antidote: understanding and defending against poisoning of anomaly detectors. In: Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement Conference, pp 1–14. ACM (2009)
Rubinstein, B.I.P., Nelson, B., Huang, L., Joseph, A.D., Lau, S.-h., Rao, S., Taft, N., Tygar, J.D.: Stealthy poisoning attacks on PCA-based anomaly detectors. ACM SIGMETRICS Perform. Eval. Rev. 37(2), 73–74 (2009)
Reingold, O., Vadhan, S., Wigderson, A.: A note on extracting randomness from Santha-Vazirani sources. Unpublished manuscript (2004)
Shafahi, A., Huang, W.R., Najibi, M., Suciu, O., Studer, C., Dumitras, T., Goldstein, T.: Poison frogs! Targeted clean-label poisoning attacks on neural networks. In: Advances in Neural Information Processing Systems, pp 6103–6113 (2018)
Sloan, R.H.: Four types of noise in data for PAC learning. Inf. Process. Lett. 54(3), 157–162 (1995)
Shen, S., Tople, S., Saxena, P.: Auror: Defending against poisoning attacks in collaborative deep learning systems. In: Proceedings of the 32nd Annual Conference on Computer Security Applications, pp 508–519. ACM (2016)
Santha, M., Vazirani, U.V.: Generating quasi-random sequences from semi-random sources. J. Comput. Syst. Sci. 33(1), 75–87 (1986)
Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)
Valiant, L.G.: Learning disjunctions of conjunctions. In: IJCAI, pp 560–566 (1985)
von Neumann, J.: Various techniques used in connection with random digits. Appl. Math. Ser. 12, 36–38 (1951)
Wang, Y., Chaudhuri, K.: Data poisoning attacks against online learning. arXiv:1808.08994 (2018)
Xiao, H., Biggio, B., Brown, G., Fumera, G., Eckert, C., Roli, F.: Is feature selection secure against training data poisoning? In: ICML, pp. 1689–1698 (2015)
Xu, H., Mannor, S.: Robustness and generalization. Mach. Learn. 86(3), 391–423 (2012)
Yamazaki, K., Kawanabe, M., Watanabe, S., Sugiyama, M., Müller, K.-R.: Asymptotic Bayesian generalization error when training and test distributions are different. In: ICML, pp 1079–1086 (2007)
Acknowledgements
We thank the anonymous reviewers of the International Conference in Algorithmic Learning Theory (ALT) 2018 as well as of the International Symposium in Artificial Intelligence and Mathematics (ISAIM) 2018 for their useful comments on earlier versions of this work.
Additional information
A preliminary version titled “Learning under p-Tampering Attacks” appeared in Algorithmic Learning Theory (ALT) 2018.
Saeed Mahloujifar was supported by University of Virginia’s SEAS Research Innovation Award.
Mohammad Mahmoody was supported by NSF CAREER award CCF-1350939 and University of Virginia’s SEAS Research Innovation Award.
Cite this article
Mahloujifar, S., Diochnos, D.I. & Mahmoody, M. Learning under p-tampering poisoning attacks. Ann Math Artif Intell 88, 759–792 (2020). https://doi.org/10.1007/s10472-019-09675-1