Abstract
Network traffic analysis is used to detect intrusions and manage application traffic. Continuous batch network traffic analysis is computationally demanding. Because traffic intensity varies between natural peaks and troughs, a network analysis cluster may have to be severely over-dimensioned to support 24/7 continuous packet block capture and processing. In this paper, we characterize the computational requirements of processing network traffic packets under several conditions; this characterization is a useful tool for generating network workloads in simulated scenarios. Our target MapReduce jobs are map-intensive and include string matching-based virus and malware detection. We present an architecture for a Hadoop-based network analysis solution that includes a scheduler, report on using this approach in a small cluster, and show scheduling performance results obtained through simulation. The scheduler considers a cloud-based traffic analysis solution that bursts traffic to the cloud to overcome local resource limitations. The results show that we can reduce the amount of traffic burst to the cloud by up to 50% while still accomplishing continuous batch traffic analysis with run times comparable to those of a single job.
Acknowledgments
Work (partially) funded by the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 within project POCI-01-0145-FEDER-006961, and by FCT – Portuguese Foundation for Science and Technology as part of projects UID/EEA/50014/2013 and UID/CEC/00027/2013.
Cite this article
Morla, R., Gonçalves, P. & Barbosa, J.G. High-performance network traffic analysis for continuous batch intrusion detection. J Supercomput 72, 4107–4128 (2016). https://doi.org/10.1007/s11227-016-1743-6
DOI: https://doi.org/10.1007/s11227-016-1743-6