Learning under p-tampering poisoning attacks

Annals of Mathematics and Artificial Intelligence

Abstract

Recently, Mahloujifar and Mahmoody (Theory of Cryptography Conference 2017) studied attacks against learning algorithms that use a special case of Valiant's malicious noise, called p-tampering, in which the adversary gets to change any training example with independent probability p but is limited to choosing 'adversarial' examples with correct labels. They obtained p-tampering attacks that increase the error probability in the so-called 'targeted' poisoning model, in which the adversary's goal is to increase the loss of the trained hypothesis on a particular test example. At the heart of their attack was an efficient algorithm for biasing the expected value of any bounded real-valued function through p-tampering. In this work, we present new biasing attacks for increasing the expected value of bounded real-valued functions. Our improved biasing attacks directly imply improved p-tampering attacks against learners in the targeted poisoning model. As a bonus, our attacks come with a considerably simpler analysis. We also study the possibility of PAC learning under p-tampering attacks in the non-targeted (aka indiscriminate) setting, where the adversary's goal is to increase the risk of the generated hypothesis over a random test example. We show that PAC learning is possible under p-tampering poisoning attacks essentially whenever it is possible in the realizable setting without attacks. We further show that PAC learning under 'no-mistake' adversarial noise is not possible if the adversary can choose which examples to tamper with (still limited to a p fraction of them) and substitute them with adversarially chosen ones. Our formal model for such 'bounded-budget' tampering attackers is inspired by notions of adaptive corruption in cryptography.
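
The core primitive underlying both the earlier and the new attacks is biasing the mean of a bounded function by tampering each input block independently with probability p. As an illustration only (a minimal hypothetical sketch, not the algorithm from the paper), the following Python code mounts a greedy p-tampering attack on a function f: {0,1}^n → [0,1] over uniform bits: a tampered bit is set to whichever value has the larger Monte-Carlo estimate of the conditional expectation of f. The names p_tamper and estimate_cond_exp, the uniform input distribution, and the sampling parameters are all illustrative assumptions.

```python
import random

def estimate_cond_exp(f, prefix, n, samples=100):
    """Monte-Carlo estimate of E[f(prefix || suffix)] over uniform
    random suffixes completing the prefix to n bits."""
    total = 0.0
    for _ in range(samples):
        suffix = [random.randint(0, 1) for _ in range(n - len(prefix))]
        total += f(prefix + suffix)
    return total / samples

def p_tamper(f, n, p, samples=100):
    """Draw one n-bit input from a p-tampered distribution that greedily
    biases E[f] upward: each bit is tampered with independent probability
    p, and a tampered bit takes whichever value has the larger estimated
    conditional expectation of f."""
    x = []
    for _ in range(n):
        if random.random() < p:
            # Tampered block: pick the value that looks better for f.
            e1 = estimate_cond_exp(f, x + [1], n, samples)
            e0 = estimate_cond_exp(f, x + [0], n, samples)
            x.append(1 if e1 >= e0 else 0)
        else:
            # Honest block: drawn from the original (uniform) distribution.
            x.append(random.randint(0, 1))
    return x

if __name__ == "__main__":
    n, p, trials = 10, 0.2, 1000
    f = lambda bits: sum(bits) / len(bits)  # bounded in [0, 1]
    honest = sum(f([random.randint(0, 1) for _ in range(n)])
                 for _ in range(trials)) / trials
    tampered = sum(f(p_tamper(f, n, p)) for _ in range(trials)) / trials
    print(f"E[f] honest ~ {honest:.3f}, under p-tampering ~ {tampered:.3f}")
```

For the toy function f(x) = (x_1 + ... + x_n)/n, the honest mean is 0.5 while the tampered mean is roughly 0.5 + p/2, since a tampered bit is almost always set to 1; this shows how even a small per-example tampering probability measurably shifts the expectation.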


References

  1. Awasthi, P., Balcan, M.F., Long, P.M.: The power of localization for efficiently learning linear separators with noise. In: Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pp 449–458. ACM (2014)

  2. Austrin, P., Chung, K.-M., Mahmoody, M., Pass, R., Seth, K.: On the impossibility of cryptography with tamperable randomness. In: International Cryptology Conference, pp 462–479. Springer (2014)

  3. Angluin, D., Krikis, M., Sloan, R.H., Turán, G.: Malicious omissions and errors in answers to membership queries. Mach. Learn. 28(2–3), 211–255 (1997)

  4. Aumann, Y., Lindell, Y.: Security against covert adversaries: efficient protocols for realistic adversaries. In: Theory of Cryptography Conference, pp 137–156. Springer (2007)

  5. Angluin, D.: Queries and concept learning. Mach. Learn. 2(4), 319–342 (1987)

  6. Beigi, S., Etesami, O., Gohari, A.: Deterministic randomness extraction from generalized and distributed Santha–Vazirani sources. SIAM J. Comput. 46(1), 1–36 (2017)

  7. Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Occam’s razor. Inf. Process. Lett. 24(6), 377–380 (1987)

  8. Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Learnability and the Vapnik-Chervonenkis dimension. J. ACM 36(4), 929–965 (1989)

  9. Bshouty, N.H., Eiron, N., Kushilevitz, E.: PAC learning with nasty noise. Theor. Comput. Sci. 288(2), 255–275 (2002)

  10. Bentov, I., Gabizon, A., Zuckerman, D.: Bitcoin beacon. arXiv:1605.04559 (2016)

  11. Benedek, G.M., Itai, A.: Learnability with respect to fixed distributions. Theor. Comput. Sci. 86(2), 377–390 (1991)

  12. Biggio, B., Nelson, B., Laskov, P.: Poisoning attacks against support vector machines. In: Proceedings of the 29th International Conference on Machine Learning, pp 1467–1474. Omnipress (2012)

  13. Barreno, M., Nelson, B., Sears, R., Joseph, A.D., Tygar, J.D.: Can machine learning be secure? In: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, pp 16–25. ACM (2006)

  14. Canetti, R., Feige, U., Goldreich, O., Naor, M.: Adaptively secure multi-party computation. In: 28th Annual ACM Symposium on Theory of Computing, pp 639–648. ACM Press, Philadephia (1996)

  15. Chor, B., Goldreich, O.: Unbiased bits from sources of weak randomness and probabilistic communication complexity. In: Proc. 26th FOCS, pp 429–442. IEEE (1985)

  16. Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat. 23(4), 493–507 (1952)

  17. Charikar, M., Steinhardt, J., Valiant, G.: Learning from untrusted data. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp 47–60. ACM (2017)

  18. Diochnos, D.I.: On the evolution of monotone conjunctions: drilling for best approximations. In: ALT, pp 98–112 (2016)

  19. Diakonikolas, I., Kamath, G., Kane, D.M., Li, J., Moitra, A., Stewart, A.: Robust estimators in high dimensions without the computational intractability. In: 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pp 655–664. IEEE (2016)

  20. Diakonikolas, I., Kamath, G., Kane, D.M., Li, J., Steinhardt, J., Stewart, A.: Sever: a robust meta-algorithm for stochastic optimization. arXiv:1803.02815 (2018)

  21. Diakonikolas, I., Kane, D.M., Stewart, A.: Statistical query lower bounds for robust estimation of high-dimensional Gaussians and Gaussian mixtures. In: 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pp 73–84. IEEE (2017)

  22. Diakonikolas, I., Kane, D.M., Stewart, A.: List-decodable robust mean estimation and learning mixtures of spherical Gaussians. In: Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pp 1047–1060. ACM (2018)

  23. Diakonikolas, I., Kong, W., Stewart, A.: Efficient algorithms and lower bounds for robust linear regression. arXiv:1806.00040 (2018)

  24. Dodis, Y., Ong, S.J., Prabhakaran, M., Sahai, A.: On the (im)possibility of cryptography with imperfect randomness. In: IEEE Symposium on Foundations of Computer Science (FOCS) (2004)

  25. Dwork, C., Rothblum, G.N., Vadhan, S.: Boosting and differential privacy. In: 2010 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp 51–60. IEEE (2010)

  26. Dodis, Y., Yao, Y.: Privacy with imperfect randomness. In: Annual Cryptology Conference, pp 463–482. Springer (2015)

  27. Ehrenfeucht, A., Haussler, D., Kearns, M.J., Valiant, L.G.: A general lower bound on the number of examples needed for learning. Inf. Comput. 82(3), 247–261 (1989)

  28. Etesami, O., Mahloujifar, S., Mahmoody, M.: Computational concentration of measure: Optimal bounds, reductions, and more. arXiv:1907.05401. To appear in SODA 2020 (2019)

  29. González, C.R., Abu-Mostafa, Y.S.: Mismatched training and test distributions can outperform matched ones. Neural Comput. 27(2), 365–387 (2015)

  30. Garg, S., Jha, S., Mahloujifar, S., Mahmoody, M.: Adversarially robust learning could leverage computational hardness. arXiv:1905.11564 (2019)

  31. Goldwasser, S., Kalai, Y.T., Park, S.: Adaptively secure coin-flipping, revisited (2015)

  32. Goldwasser, S., Kalai, Y.T., Park, S.: Adaptively secure coin-flipping, revisited. In: International Colloquium on Automata, Languages, and Programming, pp 663–674. Springer (2015)

  33. Haitner, I., Ishai, Y., Kushilevitz, E., Lindell, Y., Petrank, E.: Black-box constructions of protocols for secure computation. Cryptology ePrint Archive, Report 2010/164, http://eprint.iacr.org/2010/164 (2010)

  34. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963)

  35. Kearns, M.J., Li, M.: Learning in the presence of malicious errors. SIAM J. Comput. 22(4), 807–837 (1993)

  36. Lai, K.A., Rao, A.B., Vempala, S.: Agnostic estimation of mean and covariance. In: 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pp 665–674. IEEE (2016)

  37. Mahloujifar, S., Mahmoody, M.: Blockwise p-tampering attacks on cryptographic primitives, extractors, and learners. In: Theory of Cryptography Conference, pp 245–279. Springer (2017)

  38. Mahloujifar, S., Mahmoody, M.: Blockwise p-tampering attacks on cryptographic primitives, extractors, and learners. Cryptology ePrint Archive, Report 2017/950, https://eprint.iacr.org/2017/950 (2017)

  39. Mahloujifar, S., Mahmoody, M.: Can adversarially robust learning leverage computational hardness? In: Algorithmic Learning Theory, pp. 581–609 (2019)

  40. Nakamoto, S.: Bitcoin: A peer-to-peer electronic cash system (2008)

  41. Papernot, N., McDaniel, P., Sinha, A., Wellman, M.: Towards the science of security and privacy in machine learning. arXiv:1611.03814 (2016)

  42. Prasad, A., Suggala, A.S., Balakrishnan, S., Ravikumar, P.: Robust estimation via robust gradient estimation. arXiv:1802.06485 (2018)

  43. Pitt, L., Valiant, L.G.: Computational limitations on learning from examples. J. ACM 35(4), 965–984 (1988)

  44. Rao, C.R.: Information and the accuracy attainable in the estimation of statistical parameters. In: Breakthroughs in Statistics, pp 235–247. Springer (1992)

  45. Rubinstein, B.I.P., Nelson, B., Huang, L., Joseph, A.D., Lau, S.-h., Rao, S., Taft, N., Tygar, J.D.: Antidote: understanding and defending against poisoning of anomaly detectors. In: Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement Conference, pp 1–14. ACM (2009)

  46. Rubinstein, B.I.P., Nelson, B., Huang, L., Joseph, A.D., Lau, S.-h., Rao, S., Taft, N., Tygar, J.D.: Stealthy poisoning attacks on PCA-based anomaly detectors. ACM SIGMETRICS Perform. Eval. Rev. 37(2), 73–74 (2009)

  47. Reingold, O., Vadhan, S., Wigderson, A.: A note on extracting randomness from Santha-Vazirani sources. Unpublished manuscript (2004)

  48. Shafahi, A., Huang, W.R., Najibi, M., Suciu, O., Studer, C., Dumitras, T., Goldstein, T.: Poison frogs! Targeted clean-label poisoning attacks on neural networks. In: Advances in Neural Information Processing Systems, pp 6103–6113 (2018)

  49. Sloan, R.H.: Four types of noise in data for PAC learning. Inf. Process. Lett. 54(3), 157–162 (1995)

  50. Shen, S., Tople, S., Saxena, P.: Auror: defending against poisoning attacks in collaborative deep learning systems. In: Proceedings of the 32nd Annual Conference on Computer Security Applications, pp 508–519. ACM (2016)

  51. Santha, M., Vazirani, U.V.: Generating quasi-random sequences from semi-random sources. J. Comput. Syst. Sci. 33(1), 75–87 (1986)

  52. Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)

  53. Valiant, L.G.: Learning disjunctions of conjunctions. In: IJCAI, pp 560–566 (1985)

  54. Von Neumann, J.: Various techniques used in connection with random digits. Appl. Math. Ser. 12, 36–38 (1951)

  55. Wang, Y., Chaudhuri, K.: Data poisoning attacks against online learning. arXiv:1808.08994 (2018)

  56. Xiao, H., Biggio, B., Brown, G., Fumera, G., Eckert, C., Roli, F.: Is feature selection secure against training data poisoning? In: ICML, pp. 1689–1698 (2015)

  57. Xu, H., Mannor, S.: Robustness and generalization. Mach. Learn. 86(3), 391–423 (2012)

  58. Yamazaki, K., Kawanabe, M., Watanabe, S., Sugiyama, M., Müller, K.-R.: Asymptotic Bayesian generalization error when training and test distributions are different. In: ICML, pp 1079–1086 (2007)

Acknowledgements

We thank the anonymous reviewers of the International Conference on Algorithmic Learning Theory (ALT) 2018, as well as those of the International Symposium on Artificial Intelligence and Mathematics (ISAIM) 2018, for their useful comments on earlier versions of this work.

Author information

Corresponding author

Correspondence to Saeed Mahloujifar.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version titled “Learning under p-Tampering Attacks” appeared in Algorithmic Learning Theory (ALT) 2018.

Saeed Mahloujifar was supported by University of Virginia’s SEAS Research Innovation Award.

Mohammad Mahmoody was supported by NSF CAREER award CCF-1350939 and University of Virginia’s SEAS Research Innovation Award.

About this article

Cite this article

Mahloujifar, S., Diochnos, D.I. & Mahmoody, M. Learning under p-tampering poisoning attacks. Ann Math Artif Intell 88, 759–792 (2020). https://doi.org/10.1007/s10472-019-09675-1
