Abstract
Mobile malware is a growing problem given how tightly mobile systems are integrated into daily life. Most malware programs use URLs in network traffic to deliver the commands that launch malicious activities, so detecting malicious URLs can be essential in deterring those activities. Traditional methods identify malicious URLs by constructing blacklists of verified URLs, but blacklists are ineffective against malicious URLs that have not yet been seen. Recently, machine learning-based methods have been proposed for malware detection with improved performance. In this paper, we propose a novel URL detection method based on the Floating Centroids Method (FCM), which integrates supervised classification and unsupervised clustering in a coherent manner. The proposed method uses the lexical features of a URL to identify malicious URLs while grouping similar URLs into the same cluster. Our experimental results show that a URL cluster exhibits unique behavioral patterns that can be used for malware detection with high accuracy. Moreover, the proposed behavioral clustering method facilitates the identification of malicious URL categories and unseen malware variants.
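To make the phrase "lexical features of a URL" concrete, the following is a minimal sketch in Python of one way such features could be extracted. It is not the feature set used in the paper: the function names `lexical_tokens` and `lexical_features`, the chosen features (lengths, path depth, parameter count, digit count, token counts), and the example URL are all assumptions made for illustration. The FCM classifier itself is not shown.

```python
import re
from collections import Counter
from urllib.parse import urlsplit


def lexical_tokens(url):
    """Split a URL's host, path, and query into lowercase alphanumeric tokens.

    Hypothetical helper: the tokenization rule is an assumption, not the paper's.
    """
    parts = urlsplit(url)
    text = " ".join([parts.netloc, parts.path, parts.query])
    # Any run of non-alphanumeric characters acts as a delimiter.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def lexical_features(url):
    """Return a small dictionary of illustrative lexical features for one URL."""
    parts = urlsplit(url)
    return {
        "url_length": len(url),
        "host_length": len(parts.netloc),
        "path_depth": parts.path.count("/"),
        "num_query_params": len(parts.query.split("&")) if parts.query else 0,
        "num_digits": sum(c.isdigit() for c in url),
        "token_counts": Counter(lexical_tokens(url)),
    }


if __name__ == "__main__":
    # Made-up URL used only to exercise the functions.
    example = "http://example.com/api/v1/send?imei=123&cmd=sms"
    print(lexical_tokens(example))
    print(lexical_features(example))
```

Feature dictionaries of this kind could then be vectorized and passed to any classifier or clustering algorithm; how the paper maps its features into the FCM partition space is not described in the abstract.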
Acknowledgement
This work was supported by the National Natural Science Foundation of China under Grants No. 61672262, No. 61573166 and No. 61572230, the Shandong Provincial Key R&D Program under Grants No. 2016GGX101001 and No. 2018CXGC0706, and the CERNET Next Generation Internet Technology Innovation Project under Grant No. NGII20160404. This work was also supported in part by NSF grant CNS-1566388.
Copyright information
© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Wang, S. et al. (2018). Lexical Mining of Malicious URLs for Classifying Android Malware. In: Beyah, R., Chang, B., Li, Y., Zhu, S. (eds) Security and Privacy in Communication Networks. SecureComm 2018. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 254. Springer, Cham. https://doi.org/10.1007/978-3-030-01701-9_14
DOI: https://doi.org/10.1007/978-3-030-01701-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01700-2
Online ISBN: 978-3-030-01701-9
eBook Packages: Computer Science, Computer Science (R0)