Abstract
The use of mobile phones has increased in our lives because they offer nearly the same functionality as a personal computer. In addition, the number of applications available for Android-based mobile devices has grown significantly. Google offers programmers the opportunity to upload and sell applications in the Android Market, but malware writers also upload their malicious code there. Against this background, we present Malicious Android applications Detection through String analysis (MADS), a new method that extracts the strings contained in Android applications to build machine-learning classifiers and detect malware.
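As a rough illustration of the idea, the following sketch extracts strings from raw bytes and turns them into term-frequency vectors that a classifier could consume. The abstract does not specify how strings are extracted or weighted, so everything here is an assumption: a "string" is taken to be a run of at least four printable ASCII bytes (as the Unix `strings` tool does), the function names `extract_strings`, `term_frequencies`, and `to_vector` are hypothetical, and the two byte blobs are invented stand-ins for application contents.

```python
import re
from collections import Counter


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least `min_len` printable ASCII bytes.

    Assumption: this mimics the Unix `strings` tool; the method
    described in the abstract may define a string differently.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


def term_frequencies(strings: list) -> Counter:
    """Count how often each extracted string occurs in one application."""
    return Counter(strings)


def to_vector(counts: Counter, vocabulary: list) -> list:
    """Project the counts onto a fixed, ordered vocabulary."""
    return [counts.get(term, 0) for term in vocabulary]


# Hypothetical byte blobs standing in for the contents of two applications.
benign = b"\x00\x01Hello world\x00\x02open_file\x00"
suspect = b"\x03sendTextMessage\x00\x04open_file\x00\x05sendTextMessage\x01"

# The vocabulary is the sorted union of all strings seen in the samples.
vocabulary = sorted(set(extract_strings(benign)) | set(extract_strings(suspect)))

# One term-frequency vector per application, aligned with `vocabulary`.
vectors = [
    to_vector(term_frequencies(extract_strings(blob)), vocabulary)
    for blob in (benign, suspect)
]
```

Vectors of this kind, paired with benign/malicious labels, could then be passed to any supervised learner; which learners and which weighting scheme MADS actually uses is not stated in the abstract.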
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sanz, B., Santos, I., Nieves, J., Laorden, C., Alonso-Gonzalez, I., Bringas, P.G. (2013). MADS: Malicious Android Applications Detection through String Analysis. In: Lopez, J., Huang, X., Sandhu, R. (eds) Network and System Security. NSS 2013. Lecture Notes in Computer Science, vol 7873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38631-2_14
DOI: https://doi.org/10.1007/978-3-642-38631-2_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38630-5
Online ISBN: 978-3-642-38631-2
eBook Packages: Computer Science (R0)