Abstract
This paper applies and experimentally assesses machine learning tools for addressing security issues in complex environments, specifically the identification and analysis of malicious behaviors. To evaluate how effectively machine learning algorithms detect anomalies, we consider three real-world case studies: (i) detecting and analyzing Tor traffic using a machine learning-based discrimination technique; (ii) identifying and analyzing CAN bus attacks via deep learning; and (iii) detecting and analyzing mobile malware, in particular ransomware in Android environments, by means of structural entropy-based classification. The resulting observations confirm the effectiveness of machine learning in supporting the security of complex environments.













Acknowledgements
This work has been partially supported by H2020 EU-funded projects NeCS and C3ISP and EIT-Digital Project HII.
Cite this article
Cuzzocrea, A., Martinelli, F., Mercaldo, F. et al. Experimenting and assessing machine learning tools for detecting and analyzing malicious behaviors in complex environments. J Reliable Intell Environ 4, 225–245 (2018). https://doi.org/10.1007/s40860-018-0072-3
DOI: https://doi.org/10.1007/s40860-018-0072-3