Abstract
The Android platform is heavily targeted by malware developers, who aim to infect as many mobile devices as possible by uploading malicious applications to different app markets. To keep the Android ecosystem healthy, app markets check newly submitted apps for maliciousness. These markets need to (a) correctly detect malicious apps, and (b) speed up detection of the most likely dangerous applications among an overwhelming flow of submitted apps, so that their potential damage is mitigated quickly. To address these challenges, we propose TriDroid, a market-scale triage and classification system for Android apps. TriDroid prioritizes app analysis according to risk likelihood. To this end, we categorize submitted apps as botnet, general malware, or benign. TriDroid starts with (1) a Triage process, which applies a fast, coarse-grained, and less accurate analysis to a continuous stream of submitted apps to identify each app's queue in a three-class priority queuing system. Then, (2) the Classification process extracts fine-grained static features from the apps in the priority queue and applies three-class machine learning classifiers to confirm, with high accuracy, the classification decisions of the triage process. In addition to the priority queuing model, we also propose a multi-server queuing model in which the classification of each app category runs on a different server. Experiments on a dataset of more than 24K malicious and 3K benign applications show that the priority model offers a trade-off between waiting time and processing overhead, as it requires only one server compared to the multi-server model. It also successfully prioritizes the analysis of malicious apps, giving dangerous applications a shorter waiting time than under the FIFO policy.
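To illustrate why a priority queue shortens the wait for dangerous apps relative to FIFO, the following is a minimal sketch of a single-server, non-preemptive queue. It is not TriDroid's implementation: the priority order (botnet first, then general malware, then benign), the `App` record, the `simulate` function, and the arrival and service times are all assumptions chosen for illustration, and the triage label is taken as given rather than computed.

```python
import heapq
from collections import namedtuple

# Assumed priority order (lower value is served first); not taken from the paper.
PRIORITY = {"botnet": 0, "malware": 1, "benign": 2}

# Hypothetical record: triage_label stands in for the output of the triage step.
App = namedtuple("App", "name triage_label arrival service")


def simulate(apps, use_priority):
    """Single-server, non-preemptive queue. Returns {app name: waiting time}."""
    pending = sorted(apps, key=lambda a: a.arrival)
    queue, waits = [], {}
    clock, i, seq = 0.0, 0, 0
    while i < len(pending) or queue:
        # Server idle and nothing queued: jump to the next arrival.
        if not queue and pending[i].arrival > clock:
            clock = pending[i].arrival
        # Enqueue every app that has arrived by the current time.
        while i < len(pending) and pending[i].arrival <= clock:
            app = pending[i]
            rank = PRIORITY[app.triage_label] if use_priority else 0
            # seq breaks ties in arrival order, so equal ranks behave as FIFO.
            heapq.heappush(queue, (rank, seq, app))
            seq += 1
            i += 1
        _, _, app = heapq.heappop(queue)
        waits[app.name] = clock - app.arrival
        clock += app.service
    return waits


# Made-up workload: two benign apps arrive before a malware and a botnet app.
apps = [
    App("a1", "benign", 0, 4),
    App("a2", "benign", 1, 4),
    App("a3", "malware", 2, 4),
    App("a4", "botnet", 3, 4),
]
fifo = simulate(apps, use_priority=False)
prio = simulate(apps, use_priority=True)
print("FIFO waits:    ", fifo)
print("Priority waits:", prio)
```

In this toy workload the botnet app `a4` waits less under the priority policy than under FIFO, while the benign app `a2` waits longer. That is the kind of trade-off the abstract describes, here on invented numbers rather than the paper's dataset.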
Acknowledgements
The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group no. RG-1439-021.
About this article
Cite this article
Amira, A., Derhab, A., Karbab, E.B. et al. TriDroid: a triage and classification framework for fast detection of mobile threats in android markets. J Ambient Intell Human Comput 12, 1731–1755 (2021). https://doi.org/10.1007/s12652-020-02243-0
DOI: https://doi.org/10.1007/s12652-020-02243-0