research-article

NetDiff: A Service-Guided Hierarchical Diffusion Model for Network Flow Trace Generation

Published: 21 August 2024

Abstract

Network flow traces are fundamental to many network management workflows. In this paper, we aim to generate high-fidelity network flow traces by explicitly modeling users' dynamic network usage intents. We propose NetDiff, a service-guided hierarchical diffusion model for network flow trace generation. NetDiff employs a hierarchical generation structure: one layer models mobile users' interactions with network services as app usage traces, and these generated app usage traces then guide network flow generation. NetDiff avoids pattern collapse and generates controlled samples by gradually eliminating noise and using service conditions to guide each denoising step more precisely. It captures the co-usage and sequential relationships across network service usage through a pre-trained embedding model and an encoder-decoder structure. Additionally, NetDiff captures the temporal and feature correlations present in multidimensional network flow data through a two-layer transformer network. Extensive experiments on real-world network flow datasets demonstrate that NetDiff significantly outperforms state-of-the-art baselines in terms of Jensen-Shannon divergence, total variation distance, and cumulative residual probability sum squares. Furthermore, NetDiff is robust across datasets from different cities and meets users' requirements for downstream tasks by preserving both the accuracy of downstream algorithms and their relative ordering.
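Two of the fidelity metrics named in the abstract, Jensen-Shannon divergence and total variation distance, have standard textbook definitions over discrete distributions. The following is a minimal sketch of those definitions only. The abstract does not state how the paper bins flow fields or which fields it compares, so the function names, the toy service labels, and the use of base-2 logarithms are all assumptions made for illustration.

```python
import math
from collections import Counter


def empirical_distribution(samples, support):
    """Relative frequency of each value in `support` (hypothetical helper)."""
    counts = Counter(samples)
    total = len(samples)
    return [counts[value] / total for value in support]


def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1]."""
    mixture = [0.5 * (a + b) for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Toy example: service labels attached to real and synthetic flows (made up).
support = ["video", "chat", "web"]
real = empirical_distribution(["video", "video", "chat", "web"], support)
synthetic = empirical_distribution(["video", "chat", "chat", "web"], support)

tv = total_variation(real, synthetic)
jsd = jensen_shannon(real, synthetic)
print(f"TV distance = {tv:.4f}, JS divergence = {jsd:.4f}")
```

Both metrics equal 0 when the synthetic distribution matches the real one exactly, and lower values indicate higher fidelity. With base-2 logarithms the Jensen-Shannon divergence reaches 1 for distributions with disjoint support.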

Published In

Proceedings of the ACM on Networking (PACMNET), Volume 2, Issue CoNEXT3
September 2024
108 pages
EISSN: 2834-5509
DOI: 10.1145/3689614

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. data generation
2. diffusion model
3. mobile networks
4. network flow
