Domain invariant feature extraction against evasion attack

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

In security applications, an attacker may violate the stationarity assumption that underlies most machine learning techniques. The resulting problem, known as domain shift, arises when training (source) and test (target) data follow different distributions. The inherently adversarial nature of security applications considerably affects the robustness of a learning system, so a classifier designer needs to evaluate that robustness under potential attacks during the design phase. Previous studies have investigated the effect of a reduced feature vector on the security evaluation of a learning classifier and demonstrated that traditional feature selection techniques can lead to even worse performance. An adversary-aware feature selection algorithm was therefore proposed to improve the robustness of learning systems. However, prior work on domain adaptation, which is fundamental to addressing domain shift, shows that the original feature space may not be directly suitable for reducing this distribution mismatch, because some features may have been distorted by the shift. In this paper, we propose a domain-invariant feature extraction model, based on a domain adaptation technique, to address the domain shift caused by an adversary. We first conduct an experiment that graphically shows the effect of a successful attack on the MNIST handwritten digit classification task. We then design synthetic datasets to investigate the effect of a reduced feature vector on the performance of a learning system under attack. Finally, on a spam filtering application, our proposed feature extraction model significantly outperforms both adversary-aware feature selection and traditional feature selection models.
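To make the notions of evasion attack and domain shift concrete, the following is a minimal sketch, not the model proposed in the paper. It assumes a hypothetical fixed linear classifier (`w`, `b`), synthetic 2-D Gaussian data, and a simple gradient-step evasion routine (`evade`). The mismatch between clean and attacked samples is measured by the squared difference of feature means, which corresponds to the maximum mean discrepancy under a linear kernel. All names and parameter values are illustrative assumptions.

```python
import numpy as np


def linear_mmd(source, target):
    """Squared distance between the feature means of two samples
    (the maximum mean discrepancy estimate under a linear kernel)."""
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(diff @ diff)


def evade(x, w, b, step=0.1, max_iter=100):
    """Move a malicious sample against the gradient of the linear score
    w.x + b until it is scored as legitimate (score < 0) or max_iter is hit."""
    x = x.copy()
    direction = w / np.linalg.norm(w)  # gradient of the score, normalised
    for _ in range(max_iter):
        if x @ w + b < 0:
            break
        x -= step * direction
    return x


rng = np.random.default_rng(0)

# Hypothetical malicious samples, centred at (2, 2).
malicious = rng.normal(loc=2.0, scale=0.5, size=(200, 2))

# Hypothetical fixed linear classifier: score >= 0 means "malicious".
w = np.array([1.0, 1.0])
b = 0.0

# Test-time (target) data: the same samples after the attacker modifies them.
attacked = np.array([evade(x, w, b) for x in malicious])

detected_before = float(np.mean(malicious @ w + b >= 0))
detected_after = float(np.mean(attacked @ w + b >= 0))
shift = linear_mmd(malicious, attacked)

print(f"detection rate before attack: {detected_before:.2f}")
print(f"detection rate after attack:  {detected_after:.2f}")
print(f"linear-kernel MMD between clean and attacked samples: {shift:.3f}")
```

Under these assumptions the attacked samples are pushed across the decision boundary, and the nonzero `shift` reflects that they no longer follow the distribution the classifier was designed for.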

Corresponding author

Correspondence to Sattar Hashemi.

Cite this article

Khorshidpour, Z., Tahmoresnezhad, J., Hashemi, S. et al. Domain invariant feature extraction against evasion attack. Int. J. Mach. Learn. & Cyber. 9, 2093–2104 (2018). https://doi.org/10.1007/s13042-017-0692-6

  • DOI: https://doi.org/10.1007/s13042-017-0692-6
