Audio signal processing for Android malware detection and family identification

  • Original Paper
Journal of Computer Virology and Hacking Techniques

Abstract

Mobile malware is growing in complexity and harmfulness, particularly the samples targeting Android, currently the most widespread operating system for mobile devices. Current antimalware technologies cannot detect so-called zero-day malware, because they recognize a malicious sample only once its signature has been stored in the antimalware repository (i.e., the signature-based approach). Starting from these considerations, this paper proposes an approach for detecting Android malware that also aims to identify the family to which the malicious sample under analysis belongs. We represent the executable of the application as an audio file and, by exploiting audio signal processing techniques, extract a set of numerical features from each sample. We then build several machine learning models and evaluate their effectiveness in terms of malware detection and family identification. We evaluate the proposed method on a dataset of 50,000 real-world Android samples (24,553 malicious samples across 71 families and 25,447 legitimate ones), reaching an accuracy of 0.952 in Android malware detection and 0.922 in family identification.
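The idea of treating an executable as audio can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how the bytes are encoded or which features are extracted. The sketch assumes that each byte of the executable becomes one unsigned 8-bit mono PCM sample, uses an arbitrary sample rate of 8000 Hz, and computes a zero-crossing rate as a stand-in for one audio feature. The function names `bytes_to_wav`, `read_samples`, and `zero_crossing_rate` are hypothetical.

```python
import io
import wave


def bytes_to_wav(payload: bytes, sample_rate: int = 8000) -> bytes:
    """Wrap raw bytes in a WAV container as unsigned 8-bit mono PCM.

    Assumption: one byte of the executable maps to one audio sample.
    The sample rate is an arbitrary illustrative choice.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)          # mono
        wav_file.setsampwidth(1)          # 1 byte per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(payload)     # header sizes are patched on close
    return buffer.getvalue()


def read_samples(wav_bytes: bytes) -> list[int]:
    """Read the PCM samples back as integers in the range 0..255."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return list(wav_file.readframes(wav_file.getnframes()))


def zero_crossing_rate(samples: list[int]) -> float:
    """Fraction of adjacent sample pairs whose sign differs.

    8-bit PCM is unsigned, so samples are centred on 128 first.
    """
    centered = [s - 128 for s in samples]
    crossings = sum(
        1 for a, b in zip(centered, centered[1:]) if (a < 0) != (b < 0)
    )
    return crossings / max(len(centered) - 1, 1)


if __name__ == "__main__":
    # Stand-in for the bytes of an application executable.
    fake_executable = bytes([0, 255, 0, 255, 128, 128])
    wav_bytes = bytes_to_wav(fake_executable)
    print(zero_crossing_rate(read_samples(wav_bytes)))
```

In practice a feature vector for each sample would contain many such values, and the vectors would then be passed to the machine learning models mentioned above.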

Notes

  1. https://www.zdnet.com/article/mobile-malware-attacks-are-booming-in-2019-these-are-the-most-common-threats/.

  2. https://www.darkreading.com/cloud/mobile-banking-malware-up-50--in-first-half-of-2019/d/d-id/1336834.

  3. https://docs.python.org/2/library/wave.html.

  4. https://mega.nz/file/Fd0jBKhb#ZVbd-buybGM6Anz-P3WapQ85kmwOdsOxRhWVXQL_ywk.

  5. http://amd.arguslab.org/.

  6. https://play.google.com/.

  7. https://www.virustotal.com.

  8. https://www.cs.waikato.ac.nz/ml/weka/.

  9. https://keras.io/.

  10. https://www.tensorflow.org/.

Author information

Corresponding author

Correspondence to Francesco Mercaldo.

About this article

Cite this article

Mercaldo, F., Santone, A. Audio signal processing for Android malware detection and family identification. J Comput Virol Hack Tech 17, 139–152 (2021). https://doi.org/10.1007/s11416-020-00376-6

  • DOI: https://doi.org/10.1007/s11416-020-00376-6
