GeoTraPredict: A machine learning system of web spatio-temporal traffic flow
Introduction
Traffic flow prediction is an important component of self-driving systems. In smart cities and self-driving, traffic prediction is widely used in route planning [1]. Traffic flow is closely related to population distribution: it depends not only on the absolute size of the population but also on people's concerns and interests. At present, effective methods and platforms for analyzing the distribution of population interests are lacking. Web traffic flow prediction is crucial to network providers, content providers and computer network management. It is an important task for many applications, such as adaptive applications, congestion control, admission control, anomaly detection and bandwidth allocation. As the number of network customers and the variety of services grow rapidly, networks are increasingly regarded as highly non-linear, time-varying dynamic systems, and the demand for network traffic forecasts has increased accordingly [2].
Considering that the source of network activity is people, internet traffic is found to be strongly associated with the spatio-temporal distribution of the population. Spatial analysis involving population distribution information can indirectly contribute to the analysis and prediction of web traffic flow. In addition, several works in the autonomous vehicle field have proposed traffic prediction based on spatio-temporal characteristics [3]. On this basis, this paper proposes an innovative method to predict web traffic, which combines population and network event clusters to explore the correlation between network traffic and population classification, followed by comprehensive analysis. Specifically, fluctuations in network traffic can be predicted by learning the relationship between traffic variations, network events, and the characteristics of cyber citizens. This allows us to provide corresponding measures in advance. The method closely couples network traffic analysis with network traffic prediction to increase accuracy. The network traffic analysis mainly focuses on possible correlations between changes in network traffic and their spatial and temporal distribution, which promotes better forecasting of network traffic. Consequently, spatial analysis, which can provide valuable inferences, should be considered when predicting network traffic.
The general process of network traffic prediction covers several primary steps, such as collecting network traffic data, building a traffic prediction model with parameters, training the model in a simulation environment to acquire the parameters, and adjusting the parameters within the model. There are many different methods of collecting computer network traffic data, such as SNMP, packet sniffing and NetFlow.
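The steps listed above can be illustrated with a minimal sketch. It is not the model used in this paper: the traffic series is synthetic (a daily cycle plus noise, standing in for collected data), the model is a first-order autoregression fitted by least squares, and the helper names `make_synthetic_traffic`, `fit_ar1` and `one_step_mae` are hypothetical.

```python
import math
import random


def make_synthetic_traffic(n_hours, seed=0):
    """Stand-in for collected traffic: a daily cycle plus Gaussian noise (hypothetical data)."""
    rng = random.Random(seed)
    return [100.0 + 40.0 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, 5)
            for t in range(n_hours)]


def fit_ar1(series):
    """Least-squares fit of the model x[t] = a + b * x[t-1]."""
    xs, ys = series[:-1], series[1:]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    b = cov / var
    a = mean_y - b * mean_x
    return a, b


def one_step_mae(series, a, b):
    """Mean absolute error of one-step-ahead predictions on a series."""
    errors = [abs((a + b * prev) - cur)
              for prev, cur in zip(series[:-1], series[1:])]
    return sum(errors) / len(errors)


# 1. Collect data (here: two weeks of synthetic hourly samples).
traffic = make_synthetic_traffic(24 * 14)

# 2. Split chronologically so the test part lies strictly after the training part.
split = int(len(traffic) * 0.8)
train, test = traffic[:split], traffic[split:]

# 3. Train the model to acquire its parameters, then evaluate on held-out data.
a, b = fit_ar1(train)
mae = one_step_mae(test, a, b)
print(f"a={a:.2f}, b={b:.3f}, test MAE={mae:.2f}")
```

In practice the "adjust the parameters" step would repeat steps 2 and 3 with a different model order or window length and keep the configuration with the lowest held-out error.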
Deep learning provides valuable insights into network traffic prediction. Various network traffic prediction models have been proposed, including models based on statistics, machine learning, and deep learning. Depending on the application requirements, a prediction method must balance the prediction horizon, computational cost, prediction error rate, and response time, which creates challenges for traffic prediction.
In this paper we present a platform that derives an accurate spatio-temporal distribution and prediction of population by predicting network traffic. The proposed method combines population and network event clusters to explore the correlation between network traffic and population classification, followed by comprehensive analysis. Specifically, fluctuations in network traffic can be predicted by learning the relationship between traffic variations, network events, and the characteristics of cyber citizens. This allows us to provide corresponding measures in advance. The method closely couples network traffic analysis with network traffic prediction to increase accuracy. The network traffic analysis mainly focuses on possible correlations between changes in network traffic and their spatial and temporal distribution, which promotes better forecasting of network traffic. Consequently, spatial analysis, which can provide valuable inferences, should be considered when predicting network traffic. The proposed framework, named GeoTrafficPredict, supports the accurate spatio-temporal prediction of web traffic flow. The rest of the paper is organized as follows. Section two describes related works, and section three elaborates the process of GeoTrafficPredict. Section four describes the framework of GeoTrafficPredict, including the data tier, computation tier, and visualization tier. The implementation on China’s CSTNET (China Science and Technology Network) can also be found in section four. Section five discusses the results and future work.
Related works
Web traffic flow prediction models have developed in three stages: traditional models (short-range correlation models), self-similar models (long-range correlation models), and emerging machine-learning-based models, as shown in Fig. 1. In the 1970s and early 1980s, as network applications were relatively simple, data transmission volumes were small, and network analysis technology was limited, people drew on the model of the public switched telephone network and used a Poisson model to describe the traffic of data
Events detection
Human behavior stimulated by external events shows different patterns, and considering that the source of network activity is people, the web traffic flow can be studied to detect the related events.
In the first stage, wavelet analysis was performed on the network traffic data, and correlation analysis was conducted on the clustering results and the wavelet separation results to detect the events. The original data take time as the X-axis and network traffic as the Y-axis. By the wavelet analysis, the related
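As a rough illustration of how a wavelet decomposition can expose event-like changes in a traffic series, the sketch below applies one level of the Haar transform and flags sample pairs whose detail coefficient is unusually large. This is an assumption-laden simplification, not the procedure used in this work: the wavelet family, the threshold rule (`k` times the median detail magnitude), the synthetic series and the function names `haar_step` and `flag_events` are all hypothetical.

```python
import math


def haar_step(signal):
    """One level of the Haar transform: returns (approximation, detail) coefficients."""
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / 2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in pairs]
    return approx, detail


def flag_events(signal, k=3.0):
    """Return the start index of each sample pair whose detail magnitude exceeds k * median."""
    _, detail = haar_step(signal)
    magnitudes = sorted(abs(d) for d in detail)
    median = magnitudes[len(magnitudes) // 2]
    threshold = k * max(median, 1e-9)  # guard against an all-zero detail band
    return [2 * i for i, d in enumerate(detail) if abs(d) > threshold]


# Synthetic hourly traffic: a smooth daily cycle with a one-hour burst at t = 41.
traffic = [100.0 + 10.0 * math.sin(2 * math.pi * t / 24) for t in range(96)]
traffic[41] += 80.0

events = flag_events(traffic)
print(events)
```

A single-level transform only sees changes inside each sample pair, so a burst that raises both samples of a pair equally would be missed; a multi-level or shift-invariant decomposition would be needed to catch such cases.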
SOA architecture design
Data analysis systems used to be deployed on stand-alone systems. Specifically, data processing modules, computing modules, and visualization modules are designed on computers or compute nodes with the same configuration. This architecture is suitable for small data sets and single source data sets without the need for data integration for comprehensive analysis. However, with the explosive growth of data sets including distributed multi-source data and the needs of network users, it is now
Data prediction process
GeoTrafficPredict supports the accurate spatio-temporal prediction of web traffic flow by establishing a series of models to explore the relationship between the population distribution and the stimulation of external events. After the network traffic information from the front-end data acquisition part of the system has been collected, pre-processed, formatted and stored, the event extraction model obtains processable data for statistical analysis and correlation analysis. Considering the
Cloud based model integration
Spatial analysis algorithms are diverse and have different angles of interest. Using different analysis algorithms on the same data set is more conducive to system design. However, even if the source code is freely available, it is difficult to implement all the relevant algorithms on a stand-alone machine. The specifications, configurations, and environment requirements of different algorithms or models bring endless obstacles to integration. On a cluster with multiple computers, making all
Network traffic data organization and storage
In this paper, we analyze and predict network traffic trends using network data request packets coming from CSTNET. CSTNET is an academic, non-profit scientific research computer network under the leadership of the Chinese Academy of Sciences. The collection is based on sFlow technology with a sampling rate of 1/550. The data cover January 2018 to December 2018. For the purpose of testing, we divided the data into two parts, one for training and the other for testing. In this
Experiments analysis
We conduct extensive experiments to evaluate the efficacy of the proposed GeoTrafficPredict system. We select one site deployed in CSTNET. The site is a leading provider of services including science news, project applications and project reviews. We select the 2018 flow data to the site and perform spatio-temporal correlation analysis on it. Most of its visitors are teachers and researchers from across the country. Our proposed GeoTrafficPredict system shows the variation of traffic flow of
Conclusions
In this paper, we proposed a comprehensive framework for spatial analysis on web flow prediction. The framework uses open source data and official data released by governments and related institutions to discover new network events. The framework automatically crawls network data and efficiently organizes data using a spatio-temporal cube data warehouse. During the organization of the data, the data retains time series information and geographic information. In the process of processing traffic
CRediT authorship contribution statement
Jingjing Li: Conceptualization, Methodology, Software. Jun Li: Supervision, Writing - review & editing. Nan Jia: Writing - original draft. Xunchun Li: Visualization. Wenzhen Ma: Writing - review & editing. Shanshan Shi: Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This study is partly supported by the National Natural Science Foundation of China under Grant No. 41971366, 71673158, the National Key Research Development (R&D) Plan under Grant No. 2018YFC0809700, and the Beijing National Science Foundation of China under Grant No. 9194027.
Jingjing Li is a senior engineer at the Computer Network Information Center (CNIC), Chinese Academy of Sciences (CAS). He earned his M.S. degree from the CNIC, CAS in 2007. He has been working on engineering and research in the field of network management, network user data analysis and optimization for over 10 years. His research interests include future Internet architecture, SDN, OpenFlow, network management and forwarding technology.
References (26)
- et al., Dynamic route planning with real-time traffic predictions, Inf. Syst. (2017)
- Structure optimization of bilinear recurrent neural networks and its application to ethernet network traffic prediction, Inf. Sci. (2013)
- et al., Topology-regularized universal vector autoregression for traffic forecasting in large urban areas, Expert Syst. Appl. (2017)
- et al., On the self-similar nature of ethernet traffic (extended version), IEEE/ACM Trans. Netw. (1994)
- W. Willinger, V. Paxson, M.S. Taqqu, Self-similarity and heavy tails: structural modeling of network traffic, A...
- et al., Modeling video traffic using M/G/∞ input processes: a compromise between Markovian and LRD models, IEEE J. Sel. Areas Commun. (1998)
- On the use of fractional Brownian motion in the theory of connectionless networks, IEEE J. Sel. Areas Commun. (1995)
- G. Hu, S. Zhu, B. Xie, Wavelet synthesis of fractional Brownian motion, in: WCC 2000-ICSP 2000. 2000 5th International...
- et al., A multifractal wavelet model with application to network traffic, IEEE Trans. Inf. Theory (1999)
- T. Karagiannis, M. Molle, M. Faloutsos, A. Broido, A nonstationary Poisson view of internet traffic, in: IEEE INFOCOM...
- Nonlinear network traffic prediction based on BP neural network, Jisuanji Yingyong/J. Comput. Appl.
Jun Li received the B.S. degree from Hunan University in 1989 and the M.S. and Ph.D. degrees from the Institute of Computing Technology, Chinese Academy of Sciences, in 1992 and 2006, respectively. He is currently a Professor and the Chief Engineer with the Computer Network Information Center, Chinese Academy of Sciences. He has been involved in research and engineering in the field of computer networks for over 20 years. He has authored or co-authored over 100 peer-reviewed papers and one book. His research interests include computer network architecture and protocols, involving network security, artificial intelligence and big data applications. He was a recipient of the National Technological Progress Awards.
Nan Jia is a lecturer at School for Police Information Engineering and Cyber Security, People’s Public Security University of China. Her research focuses on Big Data, public safety & security, and intelligent risk management.
Xunchun Li is a deputy director of the Radio Technology Research Institute of the Academy of Broadcasting Science. His research interests include radio and television coverage network planning and optimization, 5G transmission and GIS.
Wenzhen Ma is an associate professor at National Space Science Center, Chinese Academy of Sciences. Her main research interest is big data processing and information system architecture.
Shanshan Shi received the B.S. degree from Shandong Agricultural University in 2011 and the M.S. degree from the Computer Network Information Center, Chinese Academy of Sciences in 2016. She is currently pursuing the Ph.D. degree with the Computer Network Information Center, Chinese Academy of Sciences. Her research interests include future Internet architecture, information-centric networking, congestion control and forwarding technology.