MADS: Malicious Android Applications Detection through String Analysis

Sanz, Borja; Santos, Igor; Nieves, Javier; Laorden, Carlos; Alonso-Gonzalez, Iñigo; Bringas, Pablo G.

doi:10.1007/978-3-642-38631-2_14

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7873)

Included in the following conference series:

International Conference on Network and System Security

Abstract

Mobile phone use has grown because these devices now offer nearly the same functionality as a personal computer. The number of applications available for Android-based mobile devices has also grown significantly. Google lets programmers upload and sell applications in the Android Market, but malware writers also upload their malicious code there. Against this background, we present Malicious Android applications Detection through String analysis (MADS), a new method that extracts the strings contained in Android applications to build machine-learning classifiers and detect malware.
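The abstract gives no implementation details, so the following is only a minimal sketch of the general idea (strings in, classifier out), not the authors' pipeline. It assumes that "strings" means runs of printable ASCII bytes, that each application is represented by normalised string frequencies, and that a nearest-neighbour rule with cosine similarity stands in for whichever classifiers the paper evaluates. All function names and the toy byte blobs are hypothetical.

```python
import math
import re
from collections import Counter


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least `min_len` printable ASCII bytes (assumed definition of a string)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


def term_frequencies(strings: list[str]) -> dict[str, float]:
    """Map each distinct string to its relative frequency within one application."""
    counts = Counter(strings)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(sample: dict[str, float], labelled: list[tuple[dict[str, float], str]]) -> str:
    """Return the label of the most similar labelled vector (1-nearest neighbour)."""
    return max(labelled, key=lambda pair: cosine(sample, pair[0]))[1]


# Toy byte blobs standing in for application contents (purely illustrative).
benign = b"\x00\x01Hello world\x00settings\x02\x03open_file\x00"
malicious = b"\x7f\x00send_sms\x00premium_number\x01\x02http://example.invalid/c2\x00"
unknown = b"\x00\x00send_sms\x05\x06premium_number\x00ok\x00"

training = [
    (term_frequencies(extract_strings(benign)), "goodware"),
    (term_frequencies(extract_strings(malicious)), "malware"),
]
unknown_strings = extract_strings(unknown)
label = classify(term_frequencies(unknown_strings), training)
print(unknown_strings, label)
```

On real applications the byte blobs would come from the files inside an application package, and the training set would need many labelled samples; the toy data here only shows how the pieces fit together.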

Author information

Authors and Affiliations

S3 Lab, DeustoTech Computing, University of Deusto, Bilbao, Spain
Borja Sanz, Igor Santos, Javier Nieves, Carlos Laorden, Iñigo Alonso-Gonzalez & Pablo G. Bringas

Editor information

Editors and Affiliations

Computer Science Department, ETSI Informatica, University of Malaga, Campus de Teatinos, 29071, Malaga, Spain
Javier Lopez
School of Mathematics and Computer Science, Fujian Normal University, No. 32 Shangsan Road, 350007, Fuzhou, China
Xinyi Huang
Institute for Cyber Security, University of Texas at San Antonio, One UTSA Circle, 78249, San Antonio, TX, USA
Ravi Sandhu

Cite this paper

Sanz, B., Santos, I., Nieves, J., Laorden, C., Alonso-Gonzalez, I., Bringas, P.G. (2013). MADS: Malicious Android Applications Detection through String Analysis. In: Lopez, J., Huang, X., Sandhu, R. (eds) Network and System Security. NSS 2013. Lecture Notes in Computer Science, vol 7873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38631-2_14

DOI: https://doi.org/10.1007/978-3-642-38631-2_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38630-5
Online ISBN: 978-3-642-38631-2
eBook Packages: Computer Science, Computer Science (R0)
