Abstract
In this paper we tackle the problem of performing graph-based network forensics analysis at a large scale. To this end, we propose a novel distributed version of a popular network forensics analysis algorithm, the one by Wang and Daniels [18].
Our version of the Wang and Daniels algorithm is formulated according to the MapReduce paradigm and implemented using the Apache Spark framework. Thanks to its distributed nature, the resulting code can analyze graphs of arbitrary size in a scalable way. We also present the results of an experimental study in which we assessed both the time performance and the scalability of our algorithm when run on a distributed system of increasing size.
References
Alabdulsalam, S.K., Duong, T.Q., Choo, K.-K.R., Le-Khac, N.-A.: Evidence identification and acquisition based on network link in an internet of things environment. In: Herrero, Á., Cambra, C., Urda, D., Sedano, J., Quintián, H., Corchado, E. (eds.) CISIS 2019. AISC, vol. 1267, pp. 163–173. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-57805-3_16
Apache Software Foundation: Apache Spark (2016). http://spark.apache.org
Bompiani, E., Ferraro Petrillo, U., Jona Lasinio, G., Palini, F.: High-performance computing with TeraStat. In: Proceedings of the 2020 IEEE International Conference on Dependable, Autonomic and Secure Computing, October 2020. https://doi.org/10.1109/DASC-PICom-CBDCom-CyberSciTech49142.2020.00088
Cattaneo, G., Ferraro Petrillo, U., Nappi, M., Narducci, F., Roscigno, G.: An efficient implementation of the algorithm by Lukáš et al. on Hadoop. In: Au, M.H.A., Castiglione, A., Choo, K.-K.R., Palmieri, F., Li, K.-C. (eds.) GPC 2017. LNCS, vol. 10232, pp. 475–489. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57186-7_35
Corey, V., Peterman, C., Shearin, S., Greenberg, M.S., Van Bokkelen, J.: Network forensics analysis. IEEE Internet Comput. 6(6), 60–66 (2002)
Cybercrime Magazine: Global Cybercrime Damages Predicted To Reach $6 Trillion Annually By 2021 (2018). cybersecurityventures.com/cybercrime-damages-6-trillion-by-2021
Dave, A., Jindal, A., Li, L.E., Xin, R., Gonzalez, J., Zaharia, M.: GraphFrames: an integrated API for mixing graph and relational queries. In: Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems, pp. 1–8 (2016)
Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51, 107–113 (2008)
Dijkstra, E.W.: A note on two problems in connexion with graphs. Numerische Mathematik 1(1), 269–271 (1959)
Ferraro Petrillo, U., Roscigno, G., Cattaneo, G., Giancarlo, R.: Informational and linguistic analysis of large genomic sequence collections via efficient Hadoop cluster algorithms. Bioinformatics 34(11), 1826–1833 (2018)
Ferraro Petrillo, U., Sorella, M., Cattaneo, G., Giancarlo, R., Rombo, S.E.: Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics. BMC Bioinform. 20(4), 1–14 (2019)
Garcia, S., Grill, M., Stiborek, J., Zunino, A.: An empirical comparison of botnet detection methods. Comput. Secur. 45, 100–123 (2014)
He, J., Chang, C., He, P., Pathan, M.S.: Network forensics method based on evidence graph and vulnerability reasoning. Future Internet 8(4), 54 (2016)
Liu, C., Singhal, A., Wijesekera, D.: Creating integrated evidence graphs for network forensics. In: Peterson, G., Shenoi, S. (eds.) DigitalForensics 2013. IAICT, vol. 410, pp. 227–241. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41148-9_16
Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann, San Francisco (1996)
Malewicz, G., et al.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, pp. 135–146 (2010)
Pelaez, J.C., Fernandez, E.B.: VoIP network forensic patterns. In: 2009 Fourth International Multi-Conference on Computing in the Global Information Technology, pp. 175–180. IEEE (2009)
Wang, W., Daniels, T.E.: A graph based approach toward network forensics analysis. ACM Trans. Inf. Syst. Secur. 12(1), October 2008. https://doi.org/10.1145/1410234.1410238
Xin, R.S., Gonzalez, J.E., Franklin, M.J., Stoica, I.: GraphX: a resilient distributed graph system on Spark. In: First International Workshop on Graph Data Management Experiences and Systems, pp. 1–6 (2013)
Acknowledgements
All authors would like to thank the Department of Statistical Sciences of the University of Rome - La Sapienza for computing time on the TeraStat [3] cluster and for other computing resources, and the GARR Consortium for having made available a cutting-edge OpenStack Virtual Datacenter for this research.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Di Rocco, L., Petrillo, U.F., Palini, F. (2021). Large Scale Graph Based Network Forensics Analysis. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12665. Springer, Cham. https://doi.org/10.1007/978-3-030-68821-9_39
DOI: https://doi.org/10.1007/978-3-030-68821-9_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68820-2
Online ISBN: 978-3-030-68821-9
eBook Packages: Computer Science, Computer Science (R0)