Abstract
Although great effort has been devoted to detecting Android malware, it remains an open problem. The problem grows more complex because of the large number of features that can be extracted from Android apps to improve detection. This paper proposes wrapper feature selection that combines a genetic algorithm with a Multilayer Perceptron. To validate the proposal, feature selection is performed on the well-known Drebin dataset using Apache Spark. The experiments yield interesting results on the most informative features for detecting existing Android malware.
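As an illustration of the wrapper idea described above, the following is a minimal sketch and not the authors' implementation. A genetic algorithm evolves binary feature masks, and each mask is scored by the accuracy of a classifier trained on the selected features only. Everything in it is an assumption made for illustration: the synthetic data stands in for Drebin, a nearest-centroid classifier stands in for the Multilayer Perceptron so that the example needs only the standard library, and the function names, population size, mutation rate, and size penalty are hypothetical.

```python
import random


def make_data(n=200, n_features=12, informative=(0, 3, 7), seed=0):
    """Synthetic binary data (hypothetical): only `informative` columns track the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.randint(0, 1) for _ in range(n_features)]  # noise bits
        for j in informative:
            # An informative bit agrees with the label 90% of the time.
            row[j] = label if rng.random() < 0.9 else 1 - label
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(mask, X_train, y_train, X_test, y_test):
    """Stand-in classifier (NOT an MLP): nearest class centroid on the selected columns."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_test, y_test):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += min(dist, key=dist.get) == t
    return correct / len(y_test)


def select_features(X, y, pop_size=20, generations=15,
                    mutation_rate=0.05, penalty=0.01, seed=1):
    """Wrapper feature selection: a GA searches masks, a classifier scores them."""
    rng = random.Random(seed)
    n_features = len(X[0])
    split = int(0.7 * len(X))  # simple hold-out split for the wrapper evaluation
    train, test = (X[:split], y[:split]), (X[split:], y[split:])
    cache = {}  # avoid retraining the classifier for a mask seen before

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            acc = centroid_accuracy(mask, *train, *test)
            cache[key] = acc - penalty * sum(mask)  # prefer smaller subsets
        return cache[key]

    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]  # elitism: carry over the two best masks
        while len(next_gen) < pop_size:
            # Tournament selection of size 3 for each parent.
            p1 = max(rng.sample(ranked, 3), key=fitness)
            p2 = max(rng.sample(ranked, 3), key=fitness)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    X, y = make_data()
    mask, score = select_features(X, y)
    print("selected feature indices:", [j for j, b in enumerate(mask) if b])
    print("penalised hold-out accuracy:", round(score, 3))
```

To move closer to the setting in the abstract, `centroid_accuracy` could be swapped for a function that trains and evaluates a Multilayer Perceptron on the masked features. The cache matters more in that case, since each fitness evaluation would involve a full training run.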
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
González, S., Herrero, Á., Sedano, J., Corchado, E. (2020). Neuro-Evolutionary Feature Selection to Detect Android Malware. In: Martínez Álvarez, F., Troncoso Lora, A., Sáez Muñoz, J., Quintián, H., Corchado, E. (eds) International Joint Conference: 12th International Conference on Computational Intelligence in Security for Information Systems (CISIS 2019) and 10th International Conference on EUropean Transnational Education (ICEUTE 2019). CISIS ICEUTE 2019 2019. Advances in Intelligent Systems and Computing, vol 951. Springer, Cham. https://doi.org/10.1007/978-3-030-20005-3_13
DOI: https://doi.org/10.1007/978-3-030-20005-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20004-6
Online ISBN: 978-3-030-20005-3
eBook Packages: Intelligent Technologies and Robotics (R0)