
Leveraging the first line of defense: a study on the evolution and usage of android security permissions for enhanced android malware detection

  • Original Paper
  • Published in: Journal of Computer Virology and Hacking Techniques

Abstract

Android security permissions are built-in security features that constrain what an app can do and access on the system, that is, its privileges. Permissions have been widely used for Android malware detection, mostly in combination with other relevant app attributes. The available set of permissions is dynamic, refined in every new Android OS version release. The refinement process adds new permissions and deprecates others. These changes directly impact the type and prevalence of permissions requested by malware and legitimate applications over time. Furthermore, malware trends and benign apps’ inherent evolution influence their requested permissions. Therefore, the usage of these features in machine learning-based malware detection systems is prone to concept drift issues. Despite that, no previous study related to permissions has taken concept drift into account. In this study, we demonstrate that when concept drift is addressed, permissions can generate long-lasting and effective malware detection systems. Furthermore, the discriminatory capabilities of distinct sets of features are tested. We found that the initial set of permissions, defined in Android 1.0 (API level 1), is sufficient to build an effective detection model, providing an average 0.93 F1 score on data that spans seven years. In addition, we explored and characterized permission evolution using local and global interpretation methods. In this regard, the varying importance of individual permissions for malware and benign software recognition tasks over time is analyzed.




References

  1. Hautala, L.: Android malware tries to trick you. Here’s how to spot it. https://www.cnet.com/tech/services-and-software/android-malware-tries-to-trick-you-heres-how-to-spot-it/ (2021)

  2. Palmer, D.: Sophisticated android malware spies on smartphones users and runs up their phone bill too. https://www.zdnet.com/article/sophisticated-android-malware-spies-on-smartphones-users-and-runs-up-their-phone-bill-too/ (2018)

  3. Yaswant, A.: New advanced android malware posing as “System Update”. https://blog.zimperium.com/new-advanced-android-malware-posing-as-system-update/ (2021)

  4. O’Dea, S.: Mobile operating systems’ market share worldwide from January 2012 to June 2021. https://www.statista.com/statistics/272698/global-market-share-held-by-mobile-operating-systems-since-2009/ (2021)

  5. Kaspersky: Can you get viruses on android? every android user is at risk. https://www.kaspersky.com/resource-center/preemptive-safety/android-malware-risk (2021)

  6. Velzian, B.: Calling all threat hunters—mobile malware to look out for in 2021. https://www.wandera.com/calling-all-threat-hunters-mobile-malware-to-look-out-for-in-2021/ (2021)

  7. Android: App permissions best practices. https://developer.android.com/training/permissions/usage-notes (2021)

  8. Google: Google play protect. https://developers.google.com/android/play-protect (2021)

  9. Samsung: This is protection, samsung knox. https://www.samsungknox.com/en/secured-by-knox (2021)

  10. Withwam, R.: Android antivirus apps are useless—here’s what to do instead. https://www.extremetech.com/computing/104827-android-antivirus-apps-are-useless-heres-what-to-do-instead (2020)

  11. Lakshmanan, R.: Joker malware apps once again bypass Google’s security to spread via play store. https://thehackernews.com/2020/07/joker-android-mobile-virus.html (2020)

  12. Chebyshev, V.: Mobile malware evolution 2020. https://securelist.com/mobile-malware-evolution-2020/101029 (2021)

  13. Faruki, P., Ganmoor, V., Laxmi, V., Gaur, M.S., Bharmal, A.: Androsimilar: robust statistical feature signature for android malware detection. In: Proceedings of the 6th International Conference on Security of Information and Networks, pp. 152–159 (2013)

  14. Feizollah, A., Anuar, N.B., Salleh, R., Wahab, A.W.A.: A review on feature selection in mobile malware detection. Digit. Investig. 13, 22–37 (2015)


  15. Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H., Rieck, K., Siemens, C.: Drebin: effective and explainable detection of android malware in your pocket. In: Ndss, vol. 14, pp. 23–26 (2014)

  16. Lipovský, R., Štefanko, L., Braniša, G.: The rise of android ransomware. https://www.welivesecurity.com/wp-content/uploads/2016/02/Rise_of_Android_Ransomware.pdf (2016)

  17. Mathur, A., Podila, L.M., Kulkarni, K., Niyaz, Q., Javaid, A.Y.: Naticusdroid: a malware detection framework for android using native and custom permissions. J. Inf. Secur. Appl. 58, 102696 (2021)


  18. Khariwal, K., Singh, J., Arora, A.: Ipdroid: android malware detection using intents and permissions. In: 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4), pp. 197–202. IEEE (2020)

  19. Android: Request app permissions. https://developer.android.com/training/permissions/requesting (2021)

  20. Android: Permissions on android. https://developer.android.com/guide/topics/permissions/overview (2021)

  21. Android: Manifest.permission., https://developer.android.com/reference/android/Manifest.permission (2021)

  22. Android: Define a custom app permission. https://developer.android.com/guide/topics/permissions/defining (2021)

  23. Codepath: Understanding app permissions. https://guides.codepath.com/android/Understanding-App-Permissions (2021)

  24. Android: App permissions best practices. https://developer.android.com/training/permissions/usage-notes (2021)

  25. Android: Permissions updates in android 11. https://developer.android.com/about/versions/11/privacy/permissions (2021)

  26. Raphael, J.R.: Android versions: A living history from 1.0 to 12. https://www.computerworld.com/article/3235946/android-versions-a-living-history-from-1-0-to-today.html (2021)

  27. Milosevic, N., Dehghantanha, A., Choo, K.-K.R.: Machine learning aided android malware classification. Comput. Electr. Eng. 61, 266–274 (2017)


  28. Zhu, H.-J., You, Z.-H., Zhu, Z.-X., Shi, W.-L., Chen, X., Cheng, L.: Droiddet: effective and robust detection of android malware using static analysis along with rotation forest model. Neurocomputing 272, 638–646 (2018)


  29. Talha, K.A., Alper, D.I., Aydin, C.: Apk auditor: permission-based android malware detection system. Digit. Investig. 13, 1–14 (2015)


  30. Rovelli, P., Vigfússon, Ý.: Pmds: permission-based malware detection system. In: International Conference on Information Systems Security, pp. 338–357. Springer (2014)

  31. Sanz, B., Santos, I., Laorden, C., Ugarte-Pedrero, X., Bringas, P.G., Álvarez, G.: Puma: permission usage to detect malware in android. In: International Joint Conference CISIS’12-ICEUTE 12-SOCO 12 Special Sessions, pp. 289–298. Springer (2013)

  32. Zarni Aung, W.Z.: Permission-based android malware detection. Int. J. Sci. Technol. Res. 2, 228–234 (2013)


  33. Wang, W., Wang, X., Feng, D., Liu, J., Han, Z., Zhang, X.: Exploring permission-induced risk in android applications for malicious application detection. IEEE Trans. Inf. Forensics Secur. 9, 1869–1882 (2014)


  34. Ghasempour, A., Sani, N.F.M., Abari, O.J.: Permission extraction framework for android malware detection. Int. J. Adv. Comput. Sci. Appl. 11(11) (2020)

  35. Arora, A., Peddoju, S.K., Conti, M.: Permpair: android malware detection using permission pairs. IEEE Trans. Inf. Forensics Secur. 15, 1968–1982 (2019)


  36. Liu, X., Liu, J.: A two-layered permission-based android malware detection scheme. In: 2014 2nd IEEE International Conference on Mobile Cloud Computing, Services, and Engineering, pp. 142–148 (2014). https://doi.org/10.1109/MobileCloud.2014.22

  37. Moonsamy, V., Rong, J., Liu, S.: Mining permission patterns for contrasting clean and malicious android applications. Futur. Gener. Comput. Syst. 36, 122–132 (2014)


  38. Sokolova, K., Perez, C., Lemercier, M.: Android application classification and anomaly detection with graph-based permission patterns. Decis. Support Syst. 93, 62–76 (2017)


  39. Wang, C., Xu, Q., Lin, X., Liu, S.: Research on data mining of permissions mode for android malware detection. Clust. Comput. 22, 13337–13350 (2019)


  40. Idrees, F., Rajarajan, M., Conti, M., Chen, T.M., Rahulamathavan, Y.: Pindroid: a novel android malware detection system using ensemble learning methods. Comput. Secur. 68, 36–46 (2017)


  41. Sanz, B., Santos, I., Laorden, C., Ugarte-Pedrero, X., Nieves, J., Bringas, P.G., Álvarez Marañón, G.: Mama: manifest analysis for malware detection in android. Cybern. Syst. 44, 469–488 (2013)


  42. Arslan, R.S., Ölmez, E., Er, O.: Afwdroid: deep feature extraction and weighting for android malware detection. Dicle üniversitesi MÜhendislik Fakültesi Mühendislik Dergisi 12, 237–245 (2021)


  43. Alazab, M., Alazab, M., Shalaginov, A., Mesleh, A., Awajan, A.: Intelligent mobile malware detection using permission requests and API calls. Futur. Gener. Comput. Syst. 107, 509–521 (2020)


  44. Tao, G., Zheng, Z., Guo, Z., Lyu, M.R.: Malpat: mining patterns of malicious and benign android apps via permission-related APIs. IEEE Trans. Reliab. 67, 355–369 (2017)


  45. Kim, T., Kang, B., Rho, M., Sezer, S., Im, E.G.: A multimodal deep learning method for android malware detection using various features. IEEE Trans. Inf. Forensics Secur. 14, 773–788 (2018)


  46. Hu, D., Ma, Z., Zhang, X., Li, P., Ye, D., Ling, B.: The concept drift problem in android malware detection and its solution. Secur. Commun. Netw. 2017 (2017)

  47. Guerra-Manzanares, A., Nomm, S., Bahsi, H.: In-depth feature selection and ranking for automated detection of mobile malware. In: ICISSP, pp. 274–283 (2019)

  48. Zhou, Y., Wang, Z., Zhou, W., Jiang, X.: Hey, you, get off of my market: detecting malicious apps in official and alternative android markets. In: NDSS, vol. 25, pp. 50–52 (2012)

  49. Lindorfer, M., Neugschwandtner, M., Platzer, C.: Marvin: Efficient and comprehensive mobile app classification through static and dynamic analysis. In: IEEE 39th Annual Computer Software and Applications Conference, vol. 2, pp. 422–433. IEEE (2015)

  50. Arora, A., Peddoju, S.K.: Ntpdroid: a hybrid android malware detector using network traffic and system permissions. In: 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/12th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE), pp. 808–813. IEEE (2018)

  51. Arora, A., Peddoju, S.K., Chouhan, V., Chaudhary, A.: Hybrid android malware detection by combining supervised and unsupervised learning. In: Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, pp. 798–800 (2018)

  52. Zhou, Y., Jiang, X.: Dissecting android malware: characterization and evolution. In: IEEE Symposium on Security and Privacy, pp. 95–109. IEEE (2012)

  53. Guerra-Manzanares, A., Bahsi, H., Nõmm, S.: Kronodroid: time-based hybrid-featured dataset for effective android malware detection and characterization. Comput. Secur. 110, 102399 (2021)


  54. Mila: Contagio mobile. http://contagiominidump.blogspot.com/ (2018)

  55. Arp, D., Quiring, E., Pendlebury, F., Warnecke, A., Pierazzi, F., Wressnegger, C., Cavallaro, L., Rieck, K.: Dos and don’ts of machine learning in computer security (2020). arXiv preprint arXiv:2010.09470

  56. Pendlebury, F., Pierazzi, F., Jordaney, R., Kinder, J., Cavallaro, L.: TESSERACT: eliminating experimental bias in malware classification across space and time. In: 28th USENIX Security Symposium (USENIX Security 19), pp. 729–746 (2019)

  57. Allix, K., Bissyandé, T.F., Klein, J., Le Traon, Y.: Are your training datasets yet relevant? In: International Symposium on Engineering Secure Software and Systems, pp. 51–67. Springer (2015)

  58. Cen, L., Gates, C.S., Si, L., Li, N.: A probabilistic discriminative model for android malware detection with decompiled source code. IEEE Trans. Dependable Secure Comput. 12, 400–412 (2015)


  59. Xu, K., Li, Y., Deng, R., Chen, K., Xu, J.: Droidevolver: self-evolving android malware detection system. In: IEEE European Symposium on Security and Privacy (EuroS &P), pp. 47–62. IEEE (2019)

  60. Lei, T., Qin, Z., Wang, Z., Li, Q., Ye, D.: Evedroid: event-aware android malware detection against model degrading for ioT devices. IEEE Internet Things J. 6, 6668–6680 (2019)


  61. Guerra-Manzanares, A., Luckner, M., Bahsi, H.: Android malware concept drift using system calls: detection, characterization and challenges. Expert Syst. Appl. 117200, 117200 (2022)


  62. Lu, J., Liu, A., Dong, F., Gu, F., Gama, J., Zhang, G.: Learning under concept drift: a review. IEEE Trans. Knowl. Data Eng. 31, 2346–2363 (2018)


  63. Lu, N., Zhang, G., Lu, J.: Concept drift detection via competence models. Artif. Intell. 209, 11–28 (2014)


  64. Jordaney, R., Sharad, K., Dash, S.K., Wang, Z., Papini, D., Nouretdinov, I., Cavallaro, L.: TranscEnd: detecting concept drift in malware classification models. In: Proceedings of the 26th USENIX Security Symposium, pp. 625–642 (2017)

  65. Hooker, G., Mentch, L.: Please stop permuting features: an explanation and alternatives (2019). arXiv preprint arXiv:1905.03151

  66. Samara, B., Randles, R.H.: A test for correlation based on Kendall’s tau. Commun. Stat. Theory Methods 17, 3191–3205 (1988)

  67. Aggarwal, C.C.: Data Mining: The Textbook. Springer, Berlin (2015)


  68. Zyblewski, P., Sabourin, R., Woźniak, M.: Preprocessed dynamic classifier ensemble selection for highly imbalanced drifted data streams. Inf. Fusion 66, 138–154 (2021)


  69. Guerra-Manzanares, A., Nõmm, S., Bahsi, H.: Time-frame analysis of system calls behavior in machine learning-based mobile malware detection. In: 2019 International Conference on Cyber Security for Emerging Technologies (CSET), pp. 1–8 (2019). https://doi.org/10.1109/CSET.2019.8904908

  70. Guerra-Manzanares, A., Bahsi, H., Nõmm, S.: Differences in android behavior between real device and emulator: a malware detection perspective. In: 2019 Sixth International Conference on Internet of Things: Systems, Management and Security (IOTSMS), pp. 399–404 (2019). https://doi.org/10.1109/IOTSMS48152.2019.8939268

  71. Maimon, O., Rokach, L. (eds.): Data Mining and Knowledge Discovery Handbook. A Complete Guide for Practitioners and Researchers. Springer, San Francisco (2005)


  72. Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)


  73. Altmann, A., Toloşi, L., Sander, O., Lengauer, T.: Permutation importance: a corrected feature importance measure. Bioinformatics 26, 1340–1347 (2010)


  74. Biecek, P., Burzykowski, T.: Explanatory Model Analysis. Chapman and Hall, New York (2021)


  75. Molnar, C.: Interpretable machine learning. https://christophm.github.io/interpretable-ml-book/ (2019)

  76. Shapley, L.S.: A Value for n-Person Games. Princeton University Press, Princeton (2016)


  77. Japkowicz, N., Shah, M.: Evaluating Learning Algorithms: A Classification Perspective. Cambridge University Press, New York (2011)


  78. Breunig, M.M., Kriegel, H.-P., Ng, R.T., Sander, J.: Lof: identifying density-based local outliers. In: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pp. 93–104 (2000)

  79. Wu, L.: Android mobile ransomware: bigger, badder, better? https://www.trendmicro.com/en_us/research/17/h/android-mobile-ransomware-evolution.html (2017)

  80. Seals, T.: Slocker android ransomware resurfaces in undetectable form. https://www.infosecurity-magazine.com/news/slocker-android-ransomware/ (2017)


Author information

Corresponding author

Correspondence to Alejandro Guerra-Manzanares.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A. Two-sample Kolmogorov–Smirnov statistical test

Fig. 15 Kolmogorov–Smirnov test to compare the F1 score distributions for the reduced and extended feature sets. The maximum distance between the EDFs is 0.04 (marked in the graph with red dots).

To formally compare the performance of both feature sets (i.e., extended vs. reduced), the non-parametric two-sample Kolmogorov–Smirnov test was used to analyze the equality of the two probability distributions. The test statistic quantifies the maximum distance between the empirical distribution functions (EDFs) of two data samples. In our case, the distributions of F1 scores obtained with the reduced and extended feature sets are compared. With a p-value of 0.7634, the null hypothesis \(H_0\) that both F1 score distributions come from the same underlying distribution cannot be rejected. Moreover, the K–S statistic, which reflects the maximum distance between the EDFs, is relatively small (K–S = 0.04). The empirical cumulative distribution functions are illustrated in Fig. 15. Thus, it can be concluded that there is no significant difference in detection performance between the compared feature sets.
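As an illustration only (not the authors' code), the two-sample K–S statistic can be computed directly from the two EDFs; the sample values below are made-up F1 scores:

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum vertical
    distance between the empirical distribution functions (EDFs)."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:  # the EDF gap can only change at observed points
        cdf_a = bisect_right(a, x) / len(a)  # EDF of sample_a at x
        cdf_b = bisect_right(b, x) / len(b)  # EDF of sample_b at x
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Identical samples give D = 0; fully separated samples give D = 1
print(ks_statistic([0.91, 0.93, 0.95], [0.91, 0.93, 0.95]))  # 0.0
print(ks_statistic([0.10, 0.20], [0.80, 0.90]))              # 1.0
```

A small D, as in the 0.04 reported above, means the two EDFs are nearly indistinguishable at every point.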

Table 2 Significantly different (\(p<0.005\)) features used to maximize recall and specificity for the reduced feature set
Table 3 Android permission feature set evolution

Appendix B. Wilcoxon signed-rank test

The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test (i.e., no normal distribution is assumed) that enables the comparison of two paired data distributions (specificity and recall in this case). However, the test requires the same amount of data in both distributions and, in our particular case, since not all features were found important in all periods, there were missing values in some quarters for specific features. The quantity of missing values is critical: the data for the reduced vector showed nearly 70% missing values, reaching 83% for the extended vector. Therefore, the statistical analysis was performed only on the reduced feature set data. In this regard, as the negative values of function (4) are unknown but are needed for the test, one solution could be to replace the missing values with zeros. Yet, the large missing-value ratio might bias the comparison results, groundlessly increasing the similarity of the vector with a large number of missing values. Therefore, a better approach is to replace the missing values with the mean importance of the feature, calculated for the given data set and taken as a negative value. Thus, vectors are compared using means when the number of missing values is high and using the distribution of importance otherwise.
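The imputation strategy and the signed-rank statistic can be sketched as follows. This is a simplified illustration, not the authors' implementation: the importance values are made up, and missing quarters are modelled as `None`:

```python
def impute_missing(values, mean_importance):
    """Replace missing entries (None) with the feature's mean importance
    taken as a negative value, as described above."""
    return [v if v is not None else -mean_importance for v in values]

def signed_rank_statistic(x, y):
    """Minimal Wilcoxon signed-rank statistic W: rank the non-zero paired
    differences by absolute value (average ranks for ties) and return the
    smaller of the positive and negative rank sums."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}
    i = 0
    while i < len(abs_sorted):
        j = i
        while j < len(abs_sorted) and abs_sorted[j] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + 1 + j) / 2  # average of ranks i+1..j
        i = j
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return min(w_plus, w_minus)

# Quarterly importances of one feature for recall, with one quarter missing,
# compared against its importances for specificity
recall_imp = impute_missing([0.12, None, 0.30], mean_importance=0.20)
spec_imp = [0.40, 0.35, 0.50]
w = signed_rank_statistic(recall_imp, spec_imp)
```

The p-value would then be obtained from the distribution of W under the null hypothesis (e.g., via an exact table or a normal approximation), which is omitted here for brevity.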

The results of the Wilcoxon test, calculated for the reduced feature set and used to distinguish the permissions important to optimize the specificity and recall tasks, are reported in Table 2. The table reports the features with a p-value below 0.005, which suggests a highly significant difference between the compared vectors. The occurrence refers to the number of non-missing values in the compared vectors (i.e., the number of times the feature was found important for recall or specificity). The maximum and total importance are provided for each feature: the maximum importance reflects the largest importance value the feature had in any quarter, while the total importance is the cumulative importance of the feature over all quarters. The features are ordered by the completeness of the data vectors (i.e., from the largest to the smallest number of occurrences).

As reported in Table 2, among the 44 shared features found important for both recognition tasks, 29 showed statistically significant differences with a p-value \(< 0.005\). Among these important features showing significantly distinct importance distributions for malware detection and benign software recognition, there are relevant concept drift-related features such as READ_PHONE_STATE, SEND_SMS, and MOUNT_UNMOUNT_FILESYSTEMS. Apart from the large total importance, all these features have high occurrence and the largest maximum importance values. In all cases, the obtained importance is significantly higher for specificity. Therefore, based on this table, it can be concluded that important features for benign app recognition (i.e., specificity) are not equally relevant for the malware recognition task (i.e., recall).

Appendix C. Android permission set evolution

Table 3 lists the modifications that changed the available set of Android security permissions over time, giving a notion of the dynamic character of the permission set and its evolution throughout Android's whole lifetime. The permission set was first defined for API level 1 (i.e., the first release of Android) and has been constantly modified since then. Table 3 covers the evolution of the available permission set from API level 1 to API level 30. The table is ordered chronologically by API level release (i.e., from the oldest to the latest) and provides API level-related information such as the release date (i.e., Date) and OS version name. For each API level (i.e., row), the permissions added and deprecated in that API level are listed in the Added Permissions and Deprecated Permissions columns. Furthermore, for each added permission, the protection level is given in the Type column. Three protection levels are distinguished: dangerous, normal and others, reported as D, N and O, respectively. The others category refers to permissions that can be requested by third-party apps but do not belong to the dangerous or normal categories, as defined in the official documentation [21]. If the permission cannot be used by third-party apps, it is referenced with a hyphen (-). For each deprecated permission, the API level that introduced it is given in parentheses. Lastly, the Set column reports the number of available permissions (i.e., excluding deprecated ones) in each API level.
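The added/deprecated bookkeeping behind such a table amounts to a set difference between consecutive API levels. A minimal sketch, using toy data: the permission names are real, but the per-level sets are illustrative and do not reproduce the actual Table 3 contents:

```python
def permission_evolution(sets_by_api_level):
    """Given {api_level: set of available permissions} keyed by ascending
    API level, derive the permissions added and removed at each release."""
    changes = {}
    prev = set()
    for level in sorted(sets_by_api_level):
        current = sets_by_api_level[level]
        changes[level] = {
            "added": sorted(current - prev),       # new in this release
            "deprecated": sorted(prev - current),  # dropped in this release
        }
        prev = current
    return changes

# Toy example: one permission is added at API 16, another dropped at API 23
levels = {
    1: {"INTERNET", "SEND_SMS", "READ_LOGS"},
    16: {"INTERNET", "SEND_SMS", "READ_LOGS", "READ_CALL_LOG"},
    23: {"INTERNET", "SEND_SMS", "READ_CALL_LOG"},
}
history = permission_evolution(levels)
```

Applied to the real per-level permission sets, the `added` and `deprecated` entries would correspond to the Added Permissions and Deprecated Permissions columns of Table 3, and `len(current)` to the Set column.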

Rights and permissions

Reprints and permissions

About this article


Cite this article

Guerra-Manzanares, A., Bahsi, H. & Luckner, M. Leveraging the first line of defense: a study on the evolution and usage of android security permissions for enhanced android malware detection. J Comput Virol Hack Tech 19, 65–96 (2023). https://doi.org/10.1007/s11416-022-00432-3

