
A Scalable Mixture Model Based Defense Against Data Poisoning Attacks on Classifiers

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12312)

Abstract

Classifiers, e.g., those based on Naive Bayes, a support vector machine, or even a neural network, are highly susceptible to data-poisoning attacks. The attack objective is to degrade classification accuracy by covertly embedding malicious (labeled) samples into the training set. Such attacks can be mounted by an insider, through an outsourcing process (for data acquisition or training), or conceivably during active learning. In some cases, a very small amount of poisoning can cause a dramatic reduction in classification accuracy. Data-poisoning attacks succeed mainly because the injected malicious samples significantly skew the data distribution of the corrupted class. Such attack samples are generally data outliers and are in principle separable from the clean samples. We propose a generalized, scalable, and dynamic data-driven defense system that: 1) uses a mixture model both to fit the (potentially multi-modal) data well and to isolate attack samples within a small subset of the mixture components; 2) performs hypothesis testing to decide both which components and which samples within those components are poisoned, with the identified poisoned samples purged from the training set. Our approach addresses the attack scenario where adversarial samples are an unknown subset embedded in the initial training set, and it can be used to perform data sanitization as a precursor to the training of any type of classifier. Promising experimental results on the TREC05 spam corpus and the Amazon reviews polarity dataset demonstrate the effectiveness of our defense strategy.

This research is supported in part by an AFOSR DDDAS grant and a Cisco Systems URP gift.
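
The sketch below only illustrates the flavor of this mixture-based sanitization on simple numerical features; it is not the paper's implementation. As stated assumptions: scikit-learn's GaussianMixture with BIC-based order selection stands in for the parsimonious mixture modeling described in the abstract, and a crude per-sample log-likelihood-ratio threshold stands in for the component-level and sample-level hypothesis tests. The names fit_mixture_bic, sanitize_class, and ratio_threshold are illustrative, not from the paper.

```python
# A minimal, illustrative sketch of mixture-based training-set sanitization.
# Assumptions (not taken from the paper): Gaussian mixtures with BIC order
# selection replace the authors' parsimonious mixtures, and a simple
# log-likelihood-ratio threshold replaces their hypothesis tests.
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_mixture_bic(X, max_components=10, seed=0):
    """Fit Gaussian mixtures of increasing order and keep the BIC-best one."""
    best, best_bic = None, np.inf
    for k in range(1, max_components + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="diag",
                              random_state=seed).fit(X)
        bic = gmm.bic(X)
        if bic < best_bic:
            best, best_bic = gmm, bic
    return best


def sanitize_class(X_target, X_other, ratio_threshold=0.0):
    """Flag samples of the (possibly poisoned) target class whose
    log-likelihood is higher under the other class's mixture than under
    their own class's mixture (a crude stand-in for hypothesis testing)."""
    gmm_target = fit_mixture_bic(X_target)
    gmm_other = fit_mixture_bic(X_other)
    llr = gmm_target.score_samples(X_target) - gmm_other.score_samples(X_target)
    keep_mask = llr > ratio_threshold   # suspected poison: llr <= threshold
    return X_target[keep_mask], keep_mask


# Toy usage: clean target-class data plus injected samples drawn from the
# other class's distribution but mislabeled as the target class.
rng = np.random.default_rng(0)
X_other = rng.normal(loc=3.0, size=(500, 5))
X_clean = rng.normal(loc=0.0, size=(500, 5))
X_poison = rng.normal(loc=3.0, size=(50, 5))
X_target = np.vstack([X_clean, X_poison])

X_purged, kept = sanitize_class(X_target, X_other)
print("kept", kept.sum(), "of", len(X_target), "target-class samples")
```

In this simplified setting, the purged set X_purged would then be passed, together with the other class's data, to whatever downstream classifier is being trained.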

Author information

Correspondence to George Kesidis.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, X., Miller, D.J., Xiang, Z., Kesidis, G. (2020). A Scalable Mixture Model Based Defense Against Data Poisoning Attacks on Classifiers. In: Darema, F., Blasch, E., Ravela, S., Aved, A. (eds) Dynamic Data Driven Applications Systems. DDDAS 2020. Lecture Notes in Computer Science, vol. 12312. Springer, Cham. https://doi.org/10.1007/978-3-030-61725-7_31

  • DOI: https://doi.org/10.1007/978-3-030-61725-7_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61724-0

  • Online ISBN: 978-3-030-61725-7

  • eBook Packages: Computer Science, Computer Science (R0)
