DroidClassifier: Efficient Adaptive Mining of Application-Layer Header for Classifying Android Malware

Li, Zhiqiang; Sun, Lichao; Yan, Qiben; Srisa-an, Witawas; Chen, Zhenxiang

doi:10.1007/978-3-319-59608-2_33

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 198)

Included in the following conference series:

International Conference on Security and Privacy in Communication Systems

Abstract

A recent report has shown that there are more than 5,000 malicious applications created for Android devices each day. This creates a need for researchers to develop effective and efficient malware classification and detection approaches. To address this need, we introduce DroidClassifier: a systematic framework for classifying network traffic generated by mobile malware. Our approach utilizes network traffic analysis to construct multiple models in an automated fashion using a supervised method over a set of labeled malware network traffic (the training dataset). Each model is built by extracting common identifiers from multiple HTTP header fields. Adaptive thresholds are designed to capture the disparate characteristics of different malware families. Clustering is then used to improve the classification efficiency. Finally, we aggregate the multiple models to construct a holistic model to conduct cluster-level malware classification. We then perform a comprehensive evaluation of DroidClassifier by using 706 malware samples as the training set and 657 malware samples and 5,215 benign apps as the testing set. Collectively, these malicious and benign apps generate 17,949 network flows. The results show that DroidClassifier successfully identifies over 90% of different families of malware with more than 90% accuracy at an acceptable computational cost. Thus, DroidClassifier can facilitate network management in a large network, and enable unobtrusive detection of mobile malware. By focusing on analyzing network behaviors, we expect DroidClassifier to work with reasonable accuracy for other mobile platforms such as iOS and Windows Mobile as well.
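
The idea of building one model per malware family from identifiers shared across HTTP header fields, then applying a per-family threshold, can be illustrated with a small sketch. This is not the authors' implementation: the field names, the tokenization, the `min_support` cutoff, the scoring rule, the hand-set threshold, and the example flows are all assumptions chosen for illustration, and the clustering step is omitted.

```python
import re
from collections import Counter

# Hypothetical choice of fields; the abstract only says "multiple HTTP header fields".
FIELDS = ("host", "user-agent", "request-uri")


def tokens(value):
    """Split a header value into a set of lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", value.lower()))


def build_model(flows, min_support=0.6):
    """Per field, keep tokens seen in at least min_support of a family's flows."""
    model = {}
    for field in FIELDS:
        counts = Counter()
        for flow in flows:
            counts.update(tokens(flow.get(field, "")))
        model[field] = {t for t, c in counts.items() if c / len(flows) >= min_support}
    return model


def score(model, flow):
    """Fraction of a model's identifiers found in the flow, averaged over fields."""
    parts = []
    for field in FIELDS:
        ids = model[field]
        if ids:
            parts.append(len(ids & tokens(flow.get(field, ""))) / len(ids))
    return sum(parts) / len(parts) if parts else 0.0


def classify(models, thresholds, flow):
    """Return the best-scoring family whose own threshold is met, else None."""
    best, best_score = None, 0.0
    for family, model in models.items():
        s = score(model, flow)
        if s >= thresholds[family] and s > best_score:
            best, best_score = family, s
    return best


# Made-up labeled training flows for a hypothetical family "famA".
training = {
    "famA": [
        {"host": "ads.evil-example.com", "user-agent": "BadBot/1.0",
         "request-uri": "/gate.php?id=1"},
        {"host": "cdn.evil-example.com", "user-agent": "BadBot/1.0",
         "request-uri": "/gate.php?id=2"},
    ],
}
models = {family: build_model(flows) for family, flows in training.items()}
thresholds = {"famA": 0.7}  # set by hand here; the paper derives thresholds adaptively

suspicious = {"host": "x.evil-example.com", "user-agent": "BadBot/1.0",
              "request-uri": "/gate.php?id=9"}
benign = {"host": "www.example.org", "user-agent": "Mozilla/5.0",
          "request-uri": "/index.html"}
print(classify(models, thresholds, suspicious))
print(classify(models, thresholds, benign))
```

Keeping a separate threshold per family in `thresholds` mirrors the abstract's point that families differ in how strongly their traffic shares identifiers; how those thresholds are actually computed is described in the paper, not here.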

Acknowledgment

We thank the anonymous reviewers and our shepherd, Aziz Mohaisen, for their insightful feedback on our work. This work is supported in part by NSF grant CNS-1566388. This work is also based on research sponsored by DARPA and Maryland Procurement Office under agreement numbers FA8750-14-2-0053 and H98230-14-C-0140, respectively. Any opinions, findings, conclusions, or recommendations expressed here are those of the authors and do not necessarily reflect the views of the funding agencies or the U.S. Government.

Author information

Authors and Affiliations

University of Nebraska–Lincoln, Lincoln, NE, 68588, USA
Zhiqiang Li, Lichao Sun, Qiben Yan & Witawas Srisa-an
University of Jinan, Jinan, 250022, Shandong, China
Zhenxiang Chen

Corresponding author

Correspondence to Zhiqiang Li.

Editor information

Editors and Affiliations

Singapore Management University, Singapore, Singapore
Robert Deng
Jinan University, Guangzhou, Guangdong, China
Jian Weng
University at Buffalo, Buffalo, New York, USA
Kui Ren
SRI International, Menlo Park, California, USA
Vinod Yegneswaran

About this paper

Cite this paper

Li, Z., Sun, L., Yan, Q., Srisa-an, W., Chen, Z. (2017). DroidClassifier: Efficient Adaptive Mining of Application-Layer Header for Classifying Android Malware. In: Deng, R., Weng, J., Ren, K., Yegneswaran, V. (eds) Security and Privacy in Communication Networks. SecureComm 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 198. Springer, Cham. https://doi.org/10.1007/978-3-319-59608-2_33

DOI: https://doi.org/10.1007/978-3-319-59608-2_33
Published: 14 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59607-5
Online ISBN: 978-3-319-59608-2
eBook Packages: Computer Science, Computer Science (R0)
