Abstract
Android malware has become a serious threat in our daily digital life, and there is a pressing need to detect it effectively and to defend against it. Recent techniques rely on extracting lightweight syntactic features that are suitable for machine learning classification. Despite their promising results, the features they extract are often too simple to characterise Android applications, and thus may be insufficient for detecting Android malware. In this paper, we propose CDGDroid, an effective approach for Android malware detection based on deep learning. We use semantic graph representations, namely the control flow graph (CFG), the data flow graph (DFG), and their possible combinations, as the features to characterise Android applications. We encode the graphs into matrices and use them to train a classification model with a Convolutional Neural Network (CNN). We have conducted experiments on the Marvin, Drebin, VirusShare and ContagioDump datasets to evaluate our approach, and found that the classification model taking the horizontal combination of CFG and DFG as features offers the best accuracy among all combinations. We have also compared our approach against Yeganeh Safaei et al.’s approach, Allix et al.’s approach, Drebin, and many anti-virus tools gathered in VirusTotal; the experimental results confirm that our classification model performs better than the others.
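As a rough illustration of the encoding step described in the abstract, the sketch below turns a toy CFG and a toy DFG into matrices and combines them horizontally. It is only a minimal sketch under stated assumptions: the abstract does not give the encoding details, so the binary adjacency-matrix encoding, the integer node labels, and the names `adjacency_matrix`, `horizontal_combination` and `NUM_LABELS` are all hypothetical and are not taken from CDGDroid itself.

```python
from typing import List, Tuple

Edge = Tuple[int, int]      # (source label, target label); labels are assumed integer indices
Matrix = List[List[int]]


def adjacency_matrix(edges: List[Edge], size: int) -> Matrix:
    """Encode a directed graph over `size` node labels as a 0/1 matrix (assumed encoding)."""
    matrix = [[0] * size for _ in range(size)]
    for src, dst in edges:
        matrix[src][dst] = 1
    return matrix


def horizontal_combination(left: Matrix, right: Matrix) -> Matrix:
    """Place two matrices side by side, row by row, so an n x n pair becomes n x 2n."""
    if len(left) != len(right):
        raise ValueError("both matrices must have the same number of rows")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


# Toy graphs over four hypothetical node labels (not real Dalvik instructions).
NUM_LABELS = 4
cfg_edges = [(0, 1), (1, 2), (1, 3)]   # control flow: 0 -> 1, 1 -> 2, 1 -> 3
dfg_edges = [(0, 2), (2, 3)]           # data flow:    0 -> 2, 2 -> 3

cfg = adjacency_matrix(cfg_edges, NUM_LABELS)
dfg = adjacency_matrix(dfg_edges, NUM_LABELS)
combined = horizontal_combination(cfg, dfg)   # 4 rows, 8 columns
```

In this toy setting the combined matrix has twice as many columns as rows, and a matrix of this kind could then be fed to a CNN as a single-channel input. Whether CDGDroid uses binary entries, edge counts, or some other weighting is not stated in the abstract.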
Notes
1. There may be several additional dex files with the name “classes\(i\).dex” in large APKs.
2. For simplicity, Java reflection, callbacks and multi-threading are not considered at present.
3. The alternative extension with size \(222\times 444\) is fine as well.
4. Only the malware samples whose creation dates are in 2018 are collected.
5. For convenience, we do not consider inter-procedural analysis for traditional CFGs, since the instructions for method calls are abstracted as a special node.
References
Report from IDC. http://www.idc.com/promo/smartphone-market-share/os
Report from G DATA (2017). https://www.gdatasoftware.com/blog/2017/04/29712-8-400-new-android-malware-samples-every-day
Arzt, S., et al.: Flowdroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. In: PLDI 2014, pp. 259–269 (2014)
Wei, F., Roy, S., Ou, X.: Amandroid: a precise and general inter-component data flow analysis framework for security vetting of android apps. In: CCS 2014, pp. 1329–1341 (2014)
Enck, W., et al.: TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. In: OSDI 2010, pp. 393–407 (2010)
Enck, W., Ongtang, M., Mcdaniel, P.: On lightweight mobile phone application certification. In: CCS 2009, pp. 235–245 (2009)
Felt, A., et al.: Android permissions demystified. In: CCS 2011, pp. 627–638 (2011)
Grace, M., et al.: Riskranker: scalable and accurate zero-day android malware detection. In: MobiSys 2012, pp. 281–294 (2012)
Sahs, J., Khan, L.: A machine learning approach to android malware detection. In: EISIC 2012, pp. 141–147 (2012)
Yang, C., Xu, Z., Gu, G., Yegneswaran, V., Porras, P.: DroidMiner: automated mining and characterization of fine-grained malicious behaviors in android applications. In: Kutyłowski, M., Vaidya, J. (eds.) ESORICS 2014. LNCS, vol. 8712, pp. 163–182. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11203-9_10
Li, Y., Shen, T., Sun, X., Pan, X., Mao, B.: Detection, classification and characterization of android malware using API data dependency. In: Thuraisingham, B., Wang, X.F., Yegneswaran, V. (eds.) SecureComm 2015. LNICST, vol. 164, pp. 23–40. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-28865-9_2
Allix, K., et al.: Empirical assessment of machine learning-based malware detectors for android. Empirical Softw. Eng. 21(1), 183–211 (2016)
Narayanan, A., Liu, Y., Chen, L., Liu, J.: Adaptive and scalable android malware detection through online learning. In: IJCNN 2016, pp. 157–175 (2016)
Mclaughlin, N., et al.: Deep android malware detection. In: CODASPY 2017, pp. 301–308 (2017)
Chen, S., et al.: Automated poisoning attacks and defenses in malware detection systems: an adversarial machine learning approach. Comput. Secur. 73, 326–344 (2017)
Arp, D., et al.: DREBIN: effective and explainable detection of android malware in your pocket. In: NDSS 2014 (2014)
VirusTotal. https://www.virustotal.com
Wiśniewski, R., Tumbleson, C.: Apktool: a tool for reverse engineering Android APK files. https://ibotpeaches.github.io/Apktool/
Dalvik Bytecode. https://source.android.com/devices/tech/dalvik/dalvik-bytecode
Narayanan, A., Chandramohan, M., Chen, L., Liu, Y.: A multi-view context-aware approach to android malware detection and malicious code localization. Empirical Softw. Eng. 23(3), 1222–1274 (2017)
Lindorfer, M., Neugschwandtner, M., Platzer, C.: Marvin: efficient and comprehensive mobile app classification through static and dynamic analysis. In: COMPSAC 2015, pp. 422–433 (2015)
VirusShare. https://virusshare.com/
Contagiodump. http://contagiodump.blogspot.com/
Mi App Store. https://dev.mi.com/en
Narayanan, A., et al.: Contextual Weisfeiler-Lehman graph kernel for malware detection. In: IJCNN 2016, pp. 4701–4708 (2016)
Yang, W., et al.: Appcontext: differentiating malicious and benign mobile app behaviors using context. In: ICSE 2015, pp. 303–313 (2015)
Andriatsimandefitra, R., Tong, V.V.T.: Capturing android malware behaviour using system flow graph. In: Au, M.H., Carminati, B., Kuo, C.-C.J. (eds.) NSS 2014. LNCS, vol. 8792, pp. 534–541. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11698-3_43
Zhang, M., Duan, Y., Yin, H., Zhao, Z.: Semantics-aware android malware classification using weighted contextual API dependency graphs. In: CCS 2014, pp. 1105–1116 (2014)
Feng, Y., Anand, S., Dillig, L., Aiken, A.: Apposcopy: semantics-based detection of android malware through static analysis. In: FSE 2014, pp. 576–587 (2014)
Feng, Y., et al.: Automated synthesis of semantic malware signatures using maximum satisfiability. CoRR, abs/1608.06254 (2016)
Narayanan, A., Chandramohan, M., Chen, L., Liu, Y.: Context-aware, adaptive and scalable android malware detection through online learning (extended version). CoRR, abs/1706.00947 (2017)
Gascon, H., Yamaguchi, F., Arp, D., Rieck, K.: Structural detection of android malware using embedded call graphs. In: AISec 2013, pp. 45–54 (2013)
Du, Y., Wang, J., Li, Q.: An android malware detection approach using community structures of weighted function call graphs. IEEE Access PP(99), 1 (2017)
Fan, M., et al.: Frequent subgraph based familial classification of android malware. In: ISSRE 2016, pp. 24–35 (2016)
Chen, K., et al.: Contextual policy enforcement in android applications with permission event graphs. In: NDSS 2013 (2013)
Shen, T., et al.: Detect android malware variants using component based topology graph. In: TrustCom 2014, pp. 406–413 (2014)
Yuan, Z., Lu, Y., Wang, Z., Xue, Y.: Droid-Sec: deep learning in android malware detection. In: SIGCOMM 2014, pp. 371–372 (2014)
Yuan, Z., Lu, Y., Xue, Y.: Droiddetector: android malware characterization and detection using deep learning. Tsinghua Sci. Technol. 21(1), 114–123 (2016)
Su, X., Zhang, D., Li, W., Zhao, K.: A deep learning approach to android malware feature learning and detection. In: TrustCom 2016, pp. 244–251 (2016)
Hou, S., Saas, A., Ye, Y., Chen, L.: DroidDelver: an android malware detection system using deep belief network based on API call blocks. In: Song, S., Tong, Y. (eds.) WAIM 2016. LNCS, vol. 9998, pp. 54–66. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-47121-1_5
Wang, Z., Cai, J., Cheng, S., Li, W.: Droiddeeplearner: identifying android malware using deep learning. In: Sarnoff 2016, pp. 160–165 (2016)
Nix, R., Zhang, J.: Classification of android apps and malware using deep neural networks. In: IJCNN 2017, pp. 1871–1878 (2017)
Karbab, E., Debbabi, M., Derhab, A., Mouheb, D.: Maldozer: automatic framework for android malware detection using deep learning. Digit. Invest. 24, S48–S59 (2018)
Hou, S., Saas, A., Chen, L., Ye, Y.: Deep4maldroid: a deep learning framework for android malware detection based on Linux kernel system call graphs. In: WIW 2017, pp. 104–111 (2017)
Nauman, M., Tanveer, T., Khan, S., Syed, T.: Deep neural architectures for large scale android malware analysis. Cluster Comput. 1–20 (2017)
Zhu, D., et al.: Deepflow: deep learning-based malware detection by mining android application for abnormal usage of sensitive data. In: ISCC 2017, pp. 438–443 (2017)
Acknowledgements
The authors would like to thank the anonymous reviewers for their helpful comments. This work was partially supported by the National Natural Science Foundation of China under Grants No. 61502308 and 61772347, Science and Technology Foundation of Shenzhen City under Grant No. JCYJ20170302153712968, Project 2016050 supported by SZU R/D Fund and Natural Science Foundation of SZU (Grant No. 827-000200).
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Xu, Z., Ren, K., Qin, S., Craciun, F. (2018). CDGDroid: Android Malware Detection Based on Deep Learning Using CFG and DFG. In: Sun, J., Sun, M. (eds) Formal Methods and Software Engineering. ICFEM 2018. Lecture Notes in Computer Science(), vol 11232. Springer, Cham. https://doi.org/10.1007/978-3-030-02450-5_11
DOI: https://doi.org/10.1007/978-3-030-02450-5_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02449-9
Online ISBN: 978-3-030-02450-5
eBook Packages: Computer Science, Computer Science (R0)