ABSTRACT
Network datasets are essential for understanding, managing, and operating modern wide-area, data-center, and cellular networks. They are used throughout network development, from simulation and stress testing to training machine-learning models (for example, anomaly-based intrusion detection systems). Despite this need, network datasets remain rare because of concerns about information privacy and sensitivity.
In this paper, we tackle this challenge and put forth a method, based on Generative Adversarial Networks (GANs), for automatically generating new (and timely) datasets that are provisioned as complete raw packet traces of a network rather than just feature values.