Experimenting and assessing machine learning tools for detecting and analyzing malicious behaviors in complex environments

Cuzzocrea, Alfredo; Martinelli, Fabio; Mercaldo, Francesco; Grasso, Giorgio Mario

doi:10.1007/s40860-018-0072-3


Original Article
Published: 31 October 2018

Journal of Reliable Intelligent Environments, Volume 4, pages 225–245 (2018)

Alfredo Cuzzocrea¹,², Fabio Martinelli³, Francesco Mercaldo³ & Giorgio Mario Grasso⁴

290 Accesses
4 Citations

Abstract

This paper applies and experimentally assesses machine learning tools for addressing security issues in complex environments, specifically for identifying and analyzing malicious behaviors. To evaluate the effectiveness of machine learning algorithms in detecting anomalies, we consider three real-world case studies: (i) detecting and analyzing Tor traffic with a machine learning-based discrimination technique; (ii) identifying and analyzing CAN bus attacks via deep learning; (iii) detecting and analyzing mobile malware, with particular regard to ransomware in Android environments, by means of structural entropy-based classification. The resulting observations confirm the effectiveness of machine learning in supporting the security of complex environments.



Acknowledgements

This work has been partially supported by H2020 EU-funded projects NeCS and C3ISP and EIT-Digital Project HII.

Author information

Authors and Affiliations

DIA Department, University of Trieste and ICAR-CNR, Trieste, Italy
Alfredo Cuzzocrea
ICAR-CNR, Rende, Italy
Alfredo Cuzzocrea
IIT Institute, National Research Council, Pisa, Italy
Fabio Martinelli & Francesco Mercaldo
COSPECS Department, University of Messina, Messina, Italy
Giorgio Mario Grasso


Corresponding author

Correspondence to Alfredo Cuzzocrea.


About this article

Cite this article

Cuzzocrea, A., Martinelli, F., Mercaldo, F. et al. Experimenting and assessing machine learning tools for detecting and analyzing malicious behaviors in complex environments. J Reliable Intell Environ 4, 225–245 (2018). https://doi.org/10.1007/s40860-018-0072-3


Received: 02 October 2018
Accepted: 15 October 2018
Published: 31 October 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s40860-018-0072-3
