
Adversarial learning: the impact of statistical sample selection techniques on neural ensembles

  • Original Paper
  • Published in Evolving Systems

Abstract

Adversarial learning is a recently introduced term for machine learning in the presence of an adversary whose goal is to degrade the learning machine's performance. The key problem in adversarial learning is determining when and how an adversary will launch its attacks. It is therefore important to equip a deployed machine learning system with an appropriate defence strategy, so that it can still perform adequately in an adversarial learning environment. In this paper we investigate artificial neural networks as the learning algorithm for such an environment, owing to their ability to learn complex, nonlinear functions with little prior knowledge of the underlying true function. Two types of adversarial attack are investigated: targeted attacks, aimed at a specific group of instances, and random attacks, aimed at arbitrary instances. We hypothesise that a neural ensemble performs better than a single neural network in adversarial learning, and test this hypothesis using simulated adversarial attacks on artificial, UCI and spam data sets. The results demonstrate that an ensemble of neural networks trained on attacked data is more robust against both types of attack than a single network. While many papers have shown that an ensemble of neural networks is more robust against noise than a single network, the significance of the present work lies in the fact that targeted attacks are not white noise.
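The two attack types described above can be sketched as label-flipping perturbations of the training data: a targeted attack corrupts only instances of one chosen class, while a random attack corrupts arbitrary instances. The following is a minimal illustrative sketch, assuming binary 0/1 labels and a flip-rate parameter; the function names and signatures are hypothetical, not taken from the paper, and majority voting stands in for the ensemble combination rule.

```python
import random

def targeted_attack(labels, target_class, rate, rng):
    """Targeted attack: flip labels only for instances of one chosen class."""
    out = list(labels)
    candidates = [i for i, y in enumerate(out) if y == target_class]
    for i in rng.sample(candidates, round(rate * len(candidates))):
        out[i] = 1 - out[i]  # assumes binary 0/1 labels
    return out

def random_attack(labels, rate, rng):
    """Random attack: flip labels of arbitrary instances."""
    out = list(labels)
    for i in rng.sample(range(len(out)), round(rate * len(out))):
        out[i] = 1 - out[i]
    return out

def ensemble_predict(members, x):
    """Majority vote over the ensemble; each member maps an instance to a label."""
    votes = [m(x) for m in members]
    return max(set(votes), key=votes.count)
```

A single network trained on the attacked labels absorbs the corruption directly, whereas an ensemble whose members are trained on different samples of the attacked data can outvote members misled by the flipped instances, which is the intuition behind the robustness hypothesis tested in the paper.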





Author information

Corresponding author

Correspondence to Shir Li Wang.


About this article

Cite this article

Wang, S.L., Shafi, K., Lokan, C. et al. Adversarial learning: the impact of statistical sample selection techniques on neural ensembles. Evolving Systems 1, 181–197 (2010). https://doi.org/10.1007/s12530-010-9013-y

