Abstract
The i.i.d. assumption is the cornerstone of most conventional machine learning algorithms. However, reducing the bias and variance of a learning model on an i.i.d. dataset may not prevent it from failing on adversarial samples, which are intentionally generated by malicious users or rival programs. This paper gives a brief introduction to machine learning and adversarial learning and discusses the research frontier of the adversarial issues identified in both the machine learning and network security fields. We argue that one key cause of the adversarial issue is that learning algorithms may not exploit the input feature set sufficiently, so attackers can focus on a small set of features to trick the model. To address this issue, we consider two important classes of classifiers. For random forests, we propose a variant called Weighted Random Forest (WRF) that encourages the model to give even credit to the input features. This approach can be further improved by carefully selecting a subset of trees at run time based on clustering analysis. For neural networks, we propose adding soft constraints based on the weight variance to the objective function, so that the model bases its classification decision on more evenly distributed feature impact. Empirical experiments show that these approaches effectively improve the robustness of the learnt models relative to their baseline systems.
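The idea of a soft constraint based on weight variance can be illustrated with a minimal sketch. The abstract does not give the exact form of the constraint, so everything below is an assumption made for illustration: the "impact" of an input feature is approximated by the sum of the absolute first-layer weights attached to it, the penalty is the population variance of those impacts, and the names feature_impact, variance_penalty, penalized_loss and the coefficient lam are hypothetical.

```python
from statistics import pvariance


def feature_impact(first_layer_weights):
    """Assumed proxy for feature impact (not the paper's definition).

    first_layer_weights[i][j] is the weight from input feature i to
    hidden unit j; the impact of feature i is the sum of |weight| over j.
    """
    return [sum(abs(w) for w in row) for row in first_layer_weights]


def variance_penalty(first_layer_weights):
    """Population variance of the per-feature impacts.

    The value is zero when every feature has the same impact and grows
    as the model concentrates its weight on a few features.
    """
    return pvariance(feature_impact(first_layer_weights))


def penalized_loss(base_loss, first_layer_weights, lam=0.1):
    """Base objective plus the soft constraint, scaled by lam (assumed)."""
    return base_loss + lam * variance_penalty(first_layer_weights)


# Two toy weight matrices with three input features and two hidden units.
even = [[0.5, -0.5], [0.5, 0.5], [-0.5, 0.5]]    # impact spread evenly
skewed = [[3.0, -3.0], [0.0, 0.0], [0.0, 0.0]]   # one feature dominates

print(variance_penalty(even), variance_penalty(skewed))
```

Under these assumptions the evenly spread matrix incurs no penalty while the skewed one does, so minimizing the penalized objective would push the model away from relying on a single feature that an attacker could target.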
Notes
Since, in the context of neural networks, the term “weight” is conventionally reserved for \(w_i\) within the linear transformation, we use the term “impact” to describe the importance of an input dimension to the output and thereby avoid the conflict.
Acknowledgements
This work was supported by a grant from the Shandong Education Department (J16LN73), the Shanghai University Youth Teacher Training Funding Scheme (10-17-309-802), and the Shandong independent innovation and achievements transformation project (2014ZZCX07106).
Cite this article
Cao, N., Li, G., Zhu, P. et al. Handling the adversarial attacks. J Ambient Intell Human Comput 10, 2929–2943 (2019). https://doi.org/10.1007/s12652-018-0714-6
DOI: https://doi.org/10.1007/s12652-018-0714-6