Authors:
Neeraj Chavan, Fabio Di Troia, and Mark Stamp
Affiliation:
Department of Computer Science, San Jose State University, San Jose, California, U.S.A.
Keyword(s):
Malware, Android, Machine Learning, Random Forest, Logistic Model Tree, Artificial Neural Network.
Abstract:
In this paper, we present a comparative analysis of benign and malicious Android applications, based on static features. In particular, we focus on the permissions requested by an application. We consider both the binary classification problem of malware versus benign and the multiclass problem, where we classify malware samples into their respective families. Our experiments are based on substantial malware datasets, and we employ a wide variety of machine learning techniques, including decision trees, random forests, support vector machines, logistic model trees, AdaBoost, and artificial neural networks. We find that permissions are a strong feature and that, through careful feature engineering, we can significantly reduce the number of features needed for highly accurate detection and classification.
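As an illustration of what a permission-based static feature might look like, the following is a minimal sketch, not the authors' pipeline. It assumes the application's manifest has already been decoded to plain XML, and it encodes the requested permissions as a binary vector over a fixed vocabulary. The function names, the sample manifest, and the three-permission vocabulary are all hypothetical; the abstract does not specify how features were extracted or encoded.

```python
import xml.etree.ElementTree as ET

# Namespace used for android:* attributes in a decoded AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def requested_permissions(manifest_xml):
    """Return the set of permission names declared via <uses-permission>."""
    root = ET.fromstring(manifest_xml)
    names = (el.get(ANDROID_NS + "name") for el in root.iter("uses-permission"))
    return {name for name in names if name}


def permission_vector(permissions, vocabulary):
    """Encode a permission set as a 0/1 vector, one entry per vocabulary item."""
    return [1 if perm in permissions else 0 for perm in vocabulary]


# Hypothetical manifest and vocabulary, for illustration only.
SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.app">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.SEND_SMS"/>
</manifest>"""

VOCABULARY = [
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
    "android.permission.SEND_SMS",
]

if __name__ == "__main__":
    perms = requested_permissions(SAMPLE_MANIFEST)
    print(permission_vector(perms, VOCABULARY))
```

Vectors of this form, one per application, could then be passed to any of the classifiers named in the abstract. Reducing the number of features would correspond to shrinking the vocabulary to the most informative permissions.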