Skip to main content

DroidClassifier: Efficient Adaptive Mining of Application-Layer Header for Classifying Android Malware

  • Conference paper
  • First Online:
Security and Privacy in Communication Networks (SecureComm 2016)

Abstract

A recent report has shown that there are more than 5,000 malicious applications created for Android devices each day. This creates a need for researchers to develop effective and efficient malware classification and detection approaches. To address this need, we introduce DroidClassifier: a systematic framework for classifying network traffic generated by mobile malware. Our approach utilizes network traffic analysis to construct multiple models in an automated fashion using a supervised method over a set of labeled malware network traffic (the training dataset). Each model is built by extracting common identifiers from multiple HTTP header fields. Adaptive thresholds are designed to capture the disparate characteristics of different malware families. Clustering is then used to improve the classification efficiency. Finally, we aggregate the multiple models to construct a holistic model to conduct cluster-level malware classification. We then perform a comprehensive evaluation of DroidClassifier by using 706 malware samples as the training set and 657 malware samples and 5,215 benign apps as the testing set. Collectively, these malicious and benign apps generate 17,949 network flows. The results show that DroidClassifier successfully identifies over 90% of different families of malware with more than 90% accuracy with accessible computational cost. Thus, DroidClassifier can facilitate network management in a large network, and enable unobtrusive detection of mobile malware. By focusing on analyzing network behaviors, we expect DroidClassifier to work with reasonable accuracy for other mobile platforms such as iOS and Windows Mobile as well.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Afonso, V.M., de Amorim, M.F., Grégio, A.R.A., Junquera, G.B., de Geus, P.L.: Identifying android malware using dynamically obtained features. J. Comput. Virol. Hacking Tech. 11(1), 9–17 (2015)

    Article  Google Scholar 

  2. Aresu, M., Ariu, D., Ahmadi, M., Maiorca, D., Giacinto, G.: Clustering android malware families by http traffic. In: 10th International Conference on Malicious and Unwanted Software (MALWARE). IEEE (2015)

    Google Scholar 

  3. Arora, A., Garg, S., Peddoju, S.K.: Malware detection using network traffic analysis in android based mobile devices. In: 2014 Eighth International Conference on Next Generation Mobile Apps, Services and Technologies (NGMAST), pp. 66–71. IEEE (2014)

    Google Scholar 

  4. Arp, D., Spreitzenbarth, M., Hübner, M., Gascon, H., Rieck, K., Siemens, C.: Drebin: effective and explainable detection of android malware in your pocket. In: Annual Symposium on Network and Distributed System Security (NDSS) (2014)

    Google Scholar 

  5. Chen, Z., Han, H., Yan, Q., Yang, B., Peng, L., Zhang, L., Li, J.: A first look at android malware traffic in first few minutes. In: IEEE TrustCom 2015 (Aug. 2015)

    Google Scholar 

  6. G Data. GData mobile malware report, July 2015. https://public.gdatasoftware.com/Presse/Publikationen/Malware_Reports/G_DATA_MobileMWR_Q2_2015_EN.pdf

  7. Enck, W., Gilbert, P., Han, S., Tendulkar, V., Chun, B.-G., Cox, L.P., Jung, J., Mc-Daniel, P., Sheth, A.N.: Taintdroid: an information-flow tracking system for realtime privacy monitoring on smartphones. ACM Trans. Comput. Syst. (TOCS) 32(2), 5 (2014)

    Article  Google Scholar 

  8. F-Secure. F-secure mobile threat report, March, 2014. https://www.f-secure.com/documents/996508/1030743/Mobile_Threat_Report_Q1_2014.pdf

  9. Gill, P., Erramilli, V., Chaintreau, A., Krishnamurthy, B., Papagiannaki, K., Rodriguez, P.: Best paper–follow the money: understanding economics of online aggregation and advertising. In: Conference on Internet Measurement Conference, pp. 141–148. ACM (2013)

    Google Scholar 

  10. Hornyack, P., Han, S., Jung, J., Schechter, S., Wetherall, D.: These aren’t the droids you’re looking for: retrofitting android to protect data from imperious applications. In: ACM Conference on Computer and Communications Security, pp. 639–652. ACM (2011)

    Google Scholar 

  11. Jaccard, P.: Etude comparative de la distribution florale dans une portion des Alpes et du Jura. Impr, Corbaz (1901)

    Google Scholar 

  12. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Comput. Surv. (CSUR) 31(3), 264–323 (1999)

    Article  Google Scholar 

  13. Jensen, C.S., Prasad, M.R., Møller, A.: Automated testing with targeted event sequence generation. In: International Symposium on Software Testing and Analysis, ISSTA 2013, Lugano, Switzerland, pp. 67–77 (2013)

    Google Scholar 

  14. Kheir, N.: Analyzing HTTP user agent anomalies for malware detection. In: Pietro, R., Herranz, J., Damiani, E., State, R. (eds.) DPM/SETOP -2012. LNCS, vol. 7731, pp. 187–200. Springer, Heidelberg (2013). doi:10.1007/978-3-642-35890-6_14

    Chapter  Google Scholar 

  15. Le, A., Varmarken, J., Langhoff, S., Shuba, A., Gjoka, M., Markopoulou, A.: Antmonitor: a system for monitoring from mobile devices. In: SIGCOMM Workshop on Crowdsourcing and Crowdsharing of Big (Internet) Data, pp. 15–20. ACM (2015)

    Google Scholar 

  16. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Forschungsbericht, 707–710 S (1966)

    Google Scholar 

  17. Nari, S., Ghorbani, A.A.: Automated malware classification based on network behavior. In: 2013 International Conference on Computing, Networking and Communications (ICNC), pp. 642–647. IEEE (2013)

    Google Scholar 

  18. Narudin, F.A., Feizollah, A., Anuar, N.B., Gani, A.: Evaluation of machine learning classifiers for mobile malware detection. Soft. Comput. 20(1), 343–357 (2016)

    Article  Google Scholar 

  19. Pelleg, D., Moore, A.W., et al.: X-means: extending k-means with efficient estimation of the number of clusters. In: ICML, vol. 1, 2000

    Google Scholar 

  20. Perdisci, R., Lee, W., Feamster, N.: Behavioral clustering of http-based malware and signature generation using malicious network traces. In: NSDI, pp. 391–404 (2010)

    Google Scholar 

  21. Rafique, M.Z., Caballero, Juan: FIRMA: malware clustering and network signature generation with mixed network behaviors. In: Stolfo, S.J., Stavrou, Angelos, Wright, C.V. (eds.) RAID 2013. LNCS, vol. 8145, pp. 144–163. Springer, Heidelberg (2013). doi:10.1007/978-3-642-41284-4_8

    Chapter  Google Scholar 

  22. Rao, A., Sherry, J., Legout, A., Krishnamurthy, A., Dabbous, W., Choffnes, D.: Meddle: middleboxes for increased transparency and control of mobile traffic. In: ACM Conference on CoNEXT Student Workshop, pp. 65–66. ACM (2012)

    Google Scholar 

  23. Razaghpanah, A., Vallina-Rodriguez, N., Sundaresan, S., Kreibich, C., Gill, P., Allman, M., Paxson, V.: Haystack: In situ mobile traffic analysis in user space. In arXiv preprint arXiv:1510.01419 (2015)

  24. Rokach, L., Maimon, O.: Clustering methods. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, pp. 321–352. Springer, USA (2005)

    Chapter  Google Scholar 

  25. Shabtai, A., Kanonov, U., Elovici, Y., Glezer, C., Weiss, Y.: andromaly: a behavioral malware detection framework for android devices. J. Intell. Inf. Syst. 38(1), 161–190 (2012)

    Article  Google Scholar 

  26. Symantec Corporation. Internet Security Threat Report 2014. http://www.symantec.com/content/en/us/enterprise/other_resources/b-istr_main_report_v19_21291018.en-us.pdf. Accessed 21 June 2016

  27. Tutorialspoint. HTTP header. http://www.tutorialspoint.com/http/http_header_fields.htm. Accessed 21 June 2016

  28. Vallina-Rodriguez, N., Shah, J., Finamore, A., Grunenberger, Y., Papagiannaki, K., Haddadi, H., Crowcroft, J.: Breaking for commercials: characterizing mobile advertising. In: ACM Conference on Internet Measurement Conference, pp. 343–356. ACM (2012)

    Google Scholar 

  29. Winsniewski, R.: Android–apktool: a tool for reverse engineering android apk files. http://ibotpeaches.github.io/Apktool/, Accessed 21 June 2016

  30. Wurzinger, P., Bilge, L., Holz, T., Goebel, J., Kruegel, C., Kirda, E.: Automatically generating models for botnet detection. In: Backes, M., Ning, P. (eds.) ESORICS 2009. LNCS, vol. 5789, pp. 232–249. Springer, Heidelberg (2009). doi:10.1007/978-3-642-04444-1_15

    Chapter  Google Scholar 

  31. Xu, Q., Liao, Y., Miskovic, S., Mao, Z.M., Baldi, M., Nucci, A., Andrews, T.: Automatic generation of mobile app signatures from traffic observations. In: IEEE INFOCOM, April 2015

    Google Scholar 

  32. Xu, W., Qi, Y., Evans, D.: Automatically evading classifiers, a case study on pdf malware classifiers. In: Annual Symposium on Network and Distributed System Security (NDSS) (2016)

    Google Scholar 

  33. Yao, H., Ranjan, G., Tongaonkar, A., Liao, Y., Mao, Z.M.: Samples: self adaptive mining of persistent lexical snippets for classifying mobile application traffic. In: Annual International Conference on Mobile Computing and Networking (2015)

    Google Scholar 

  34. Zhang, J., Saha, S., Gu, G., Lee, S.-J., Mellia, M.: Systematic mining of associated server herds for malware campaign discovery. In: 2015 IEEE 35th International Conference on Distributed Computing Systems (ICDCS), pp. 630–641. IEEE (2015)

    Google Scholar 

  35. Zhang, Y., Yang, M., Xu, B., Yang, Z., Gu, G., Ning, P., Wang, X.S., Zang, B.: Vetting undesirable behaviors in android apps with permission use analysis. In: ACM SIGSAC Conference on Computer & Communications Security, pp. 611–622. ACM (2013)

    Google Scholar 

  36. Zhou, Y., Jiang, X.: Dissecting android malware: characterization and evolution In: 2012 IEEE Symposium on Security and Privacy (SP), pp. 95–109. IEEE (2012)

    Google Scholar 

Download references

Acknowledgment

We thank the anonymous reviewers and our shepherd, Aziz Mohaisen, for their insightful feedback on our work. This work is supported in part by NSF grant CNS-1566388. This work is also based on research sponsored by DARPA and Maryland Procurement Office under agreement numbers FA8750-14-2-0053 and H98230-14-C-0140, respectively. Any opinions, findings, conclusions, or recommendations expressed here are those of the authors and do not necessarily reflect the views of the funding agencies or the U.S. Government.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zhiqiang Li .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Li, Z., Sun, L., Yan, Q., Srisa-an, W., Chen, Z. (2017). DroidClassifier: Efficient Adaptive Mining of Application-Layer Header for Classifying Android Malware. In: Deng, R., Weng, J., Ren, K., Yegneswaran, V. (eds) Security and Privacy in Communication Networks. SecureComm 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 198. Springer, Cham. https://doi.org/10.1007/978-3-319-59608-2_33

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-59608-2_33

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59607-5

  • Online ISBN: 978-3-319-59608-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics