Ensemble Machine Learning Approach for Android Malware Classification Using Hybrid Features

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 578)

Abstract

Feature-based learning plays a crucial role in building and sustaining security. Determining from its extracted features whether a piece of software is benign or malicious, and in particular assigning it to the correct malware family, improves the security of the operating system and protects critical user information. In this paper, we present a novel hybrid feature-based classification system for Android malware samples. Static features, such as permissions requested by mobile applications and hidden payloads, and dynamic features, such as API calls, installed services, and network connections, are extracted for classification. We apply machine learning and evaluate the classification accuracy of different classifiers on features extracted from a fairly large set of 3339 Android malware samples belonging to 20 malware families. The evaluation was run at scale on 5 guest machines and took 8 days of processing. The testing accuracy reaches 92%.
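As a rough illustration of the idea described in the abstract, the sketch below concatenates static and dynamic indicators into one hybrid feature vector and combines several classifiers by majority vote. It is not the authors' implementation: the feature names, the family labels, the three rule-based stand-in classifiers, and the voting scheme are all assumptions made for the example, since the abstract does not specify them.

```python
from collections import Counter

# Hypothetical feature vocabularies; the abstract does not list exact features.
STATIC_VOCAB = ["perm:SEND_SMS", "perm:READ_CONTACTS", "hidden_payload"]
DYNAMIC_VOCAB = ["api:sendTextMessage", "service:started", "net:connect"]


def hybrid_vector(static_feats, dynamic_feats):
    """Concatenate binary indicators for static and dynamic features."""
    static_part = [1 if f in static_feats else 0 for f in STATIC_VOCAB]
    dynamic_part = [1 if f in dynamic_feats else 0 for f in DYNAMIC_VOCAB]
    return static_part + dynamic_part


def majority_vote(predictions):
    """Return the most frequent label among the base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Toy stand-ins for trained classifiers (assumed, not the paper's models).
def clf_sms(vec):
    return "sms_trojan" if vec[0] and vec[3] else "other"


def clf_spy(vec):
    return "spyware" if vec[1] and vec[5] else "other"


def clf_payload(vec):
    return "sms_trojan" if vec[0] or vec[2] else "other"


def classify(static_feats, dynamic_feats):
    """Build the hybrid vector and let the ensemble vote on a family label."""
    vec = hybrid_vector(static_feats, dynamic_feats)
    return majority_vote([clf(vec) for clf in (clf_sms, clf_spy, clf_payload)])


sample_static = {"perm:SEND_SMS", "hidden_payload"}
sample_dynamic = {"api:sendTextMessage"}
print(hybrid_vector(sample_static, sample_dynamic))
print(classify(sample_static, sample_dynamic))
```

In a real pipeline the rule-based functions would be replaced by classifiers trained on labelled samples; the point here is only how static and dynamic observations end up in a single vector that every ensemble member sees.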

Acknowledgements

The authors gratefully acknowledge the support of Galatasaray University, scientific research support program under grant #16.401.004.

Author information

Corresponding author

Correspondence to Tankut Acarman.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Pektaş, A., Acarman, T. (2018). Ensemble Machine Learning Approach for Android Malware Classification Using Hybrid Features. In: Kurzynski, M., Wozniak, M., Burduk, R. (eds) Proceedings of the 10th International Conference on Computer Recognition Systems CORES 2017. CORES 2017. Advances in Intelligent Systems and Computing, vol 578. Springer, Cham. https://doi.org/10.1007/978-3-319-59162-9_20

  • DOI: https://doi.org/10.1007/978-3-319-59162-9_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59161-2

  • Online ISBN: 978-3-319-59162-9

  • eBook Packages: Engineering, Engineering (R0)
