An approximation method of origin–destination flow traffic from link load counts

https://doi.org/10.1016/j.compeleceng.2011.06.009

Abstract

The traffic matrix (TM) is a key input to traffic engineering and network management. However, it is very difficult to obtain the TM directly, so TM estimation remains an active research topic. Although many TM estimation methods have been proposed, the TM is generally unavailable in large-scale IP backbone networks and is difficult to estimate accurately. This paper proposes a novel method of TM estimation in large-scale IP backbone networks, based on the generalized regression neural network (GRNN), called the GRNN TM estimation (GRNNTME) method. First, building on GRNN, we present a multi-input and multi-output model of large-scale TM estimation. Because of the strong learning and generalization capability of GRNN, the output of our model can capture the spatio-temporal correlations of the TM, which allows the TM to be estimated accurately. GRNNTME then applies a data post-processing step to bring the output of our model closer to the real values. Finally, we use real data from the Abilene network to validate GRNNTME. Simulation results show that GRNNTME estimates the TM accurately and quickly, tracks its dynamics, and offers stronger robustness and lower estimation errors.

Graphical abstract

Spatial and temporal relative estimation errors.

Highlights

► We model the traffic matrix estimation problem. ► We present a multi-input and multi-output model. ► Our model can capture the spatio-temporal correlations of the traffic matrix. ► We examine the robustness of our estimation method. ► We obtain a more accurate estimation of the traffic matrix.

Introduction

With the development of information technology, the Internet has grown exponentially in size. Its networks have evolved from simple ones into heterogeneous and complex ones, and a large-scale IP network is very difficult to maintain and manage. Hence, handling the activities associated with large-scale IP networks effectively is one of the main problems that network operators face. These activities include network dimensioning, load balancing, traffic detection, network capacity planning, route optimization, failure management, provisioning and so on [1], [3], [5], [7], [10]. To handle these activities well, network operators need to know how traffic flows through their networks. The traffic matrix (TM) gives the volume of traffic that flows between all pairs of sources and destinations in a network. Each of its elements is referred to as an origin–destination (OD) pair (or flow). The TM thus gives network operators a global view of how all the traffic in a large-scale IP network flows. Because the TM is a key input to traffic engineering and network management, it is very important for network operators to obtain it accurately in a large-scale IP network.

Unfortunately, direct measurement of the traffic is generally not practical in a large-scale IP network. As noted in [2], there are several reasons for this. First, not all network elements support the collection of the information needed to compute the TM. Second, the measurement overhead is very high, particularly in a large-scale network: the communication and storage overhead of measuring traffic volumes, which may disturb the normal operation of the network, is too large to be viable with today’s flow monitors. Finally, TM information may be private, so most ISPs (Internet Service Providers) are unwilling to measure the TM directly. Therefore, indirect measurement of the traffic has so far been the main way to obtain the TM accurately for handling these network activities.

In 1996, Vardi first introduced the network tomography method to study the problem of measuring the TM of a network indirectly. Since then, many researchers have studied the problem and proposed many solutions. The TM satisfies the following tomographic constraint:

Y = AX,    (1)

where Y denotes the link loads (written as a column vector), X the TM (also written as a column vector), and A the routing matrix, whose element A_ij equals 1 if OD flow j traverses link i and 0 otherwise [4], [6], [8]. Generally, the loads of every link in a network are easy to obtain, for example via the simple network management protocol (SNMP). The routing matrix is also readily available from the status and configuration information of the network. Thus, the problem denoted by (1) is to find a solution X given the link loads Y and the routing matrix A. Because the system defined by A is usually under-constrained, this is a highly ill-posed inverse problem. Network tomography is a good approach to this problem [9], [11], [12], [14]. In addition, for a highly under-constrained and ill-posed problem, some optimization methods [13], [15], [19] are also very useful.
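To make constraint (1) concrete, the following minimal Python sketch builds the routing matrix of a hypothetical three-node line network and evaluates Y = AX for two different flow vectors. The topology, the ordering of the OD flows and all traffic volumes are illustrative assumptions and are not taken from the Abilene data. The two flow vectors produce identical link loads, which illustrates why X cannot be recovered from Y and A alone.

```python
# Toy instance of the tomographic constraint Y = A X.
# Hypothetical topology: nodes a -> b -> c on a line.
# Rows of A (links):    l0 = a->b, l1 = b->c
# Columns of A (flows): f0 = (a,b), f1 = (b,c), f2 = (a,c)
A = [
    [1, 0, 1],  # link a->b carries flows (a,b) and (a,c)
    [0, 1, 1],  # link b->c carries flows (b,c) and (a,c)
]


def link_loads(A, x):
    """Return Y = A X for a 0/1 routing matrix A and a flow vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# Two different (made-up) traffic matrices, written as column vectors.
x_first = [3.0, 5.0, 2.0]
x_second = [1.0, 3.0, 4.0]

y_first = link_loads(A, x_first)
y_second = link_loads(A, x_second)

# Both candidate TMs explain the same link measurements.
print(y_first, y_second)
```

With 2 links and 3 OD flows the system already has more unknowns than equations; the gap only widens as the number of nodes grows.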

Although TM estimation has been studied extensively since Vardi introduced the network tomography method to the field, it is still very difficult to estimate the TM of a large-scale IP network accurately. On the one hand, a large-scale IP network is much larger than a simple one, which makes computing and handling the TM more difficult and complex; in particular, the time and space overhead is much larger. On the other hand, existing models of the TM in large-scale IP networks are not accurate enough to capture its characteristics. Vardi [9] and Cao et al. [11] used statistical inference to estimate the TM by modeling OD flows with a probability model. As noted in [6], these methods are sensitive to the prior. Zhang et al. [4], [12] discussed the problem of large-scale IP TM estimation. Although, as mentioned in [6], their method partially reduced the sensitivity to the prior, it has larger errors because it considered only the spatial correlations among OD flows. Soule et al. [16] took both the spatial and temporal correlations of the TM into account, so their estimate was closer to the real one.

To solve the problem of large-scale IP TM estimation, we propose a novel method from a new perspective, based on the generalized regression neural network (GRNN) and called the GRNN TM estimation (GRNNTME) method. GRNNTME uses a multi-input and multi-output model built on GRNN to model the large-scale IP TM estimation problem, and then applies the iterative proportional fitting procedure (IPFP) so that the output of our model satisfies the tomographic constraints of (1). GRNN is a widely applicable modeling tool. As mentioned in [17], it has several useful characteristics. First, GRNN has a highly parallel structure and is a one-pass learning algorithm. It performs only simple calculations and converges quickly to the point being sought. These properties ensure that GRNN can perform fast real-time prediction and large-scale calculation. Second, GRNN converges to the underlying regression surface and can be used for any regression problem. TM estimation is, in essence, a search for the relation between link loads and the TM, so that the unknown TM can be obtained from the observable link loads; it can therefore be regarded as a regression problem. This characteristic allows GRNN to capture the true properties of the TM in a large-scale IP network. Finally, GRNN can handle sparse data in a multidimensional measurement space. In a large-scale IP network, the TM often contains sparse data, so GRNN makes it easier to estimate the large-scale IP TM accurately. GRNN is now well studied and widely used for prediction, automatic control, system modeling and time-varying environments [18], [20]. Hence, it is easy for GRNN to realize the mapping from input data to output data and to handle the dynamics of the TM. All these properties make GRNN well suited to estimating the large-scale IP TM.
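As an illustration of the one-pass, multi-input and multi-output regression described above, the sketch below implements the basic GRNN estimate of Specht [17]: a Gaussian-kernel weighted average of the stored training outputs. It is a minimal sketch and not the authors' implementation; the function name grnn_predict, the smoothing parameter sigma and the toy training pairs (link loads as inputs, OD flows as outputs) are assumptions made for the example.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=1.0):
    """GRNN estimate: kernel-weighted average of the training outputs.

    train_x: list of input vectors (here: link loads)
    train_y: list of output vectors (here: OD flows)
    query:   input vector to predict for
    sigma:   kernel width (smoothing parameter), assumed to be given
    """
    # Squared Euclidean distance from the query to every stored pattern.
    d2 = [sum((q - xi) ** 2 for q, xi in zip(query, x)) for x in train_x]
    # Shift by the smallest distance so the nearest pattern has weight 1;
    # this avoids a zero denominator when all kernels would underflow.
    d2_min = min(d2)
    weights = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    total = sum(weights)
    n_out = len(train_y[0])
    return [
        sum(w * y[k] for w, y in zip(weights, train_y)) / total
        for k in range(n_out)
    ]


# "Training" is one pass: the input-output pairs are simply stored.
# Made-up pairs for a network with 2 links and 3 OD flows.
train_x = [[5.0, 7.0], [4.0, 4.0], [9.0, 6.0]]
train_y = [[3.0, 5.0, 2.0], [2.0, 2.0, 2.0], [6.0, 3.0, 3.0]]

# Small sigma: the prediction follows the nearest stored pattern.
sharp = grnn_predict(train_x, train_y, [5.0, 7.0], sigma=0.1)
# Very large sigma: the prediction tends to the mean of all outputs.
smooth = grnn_predict(train_x, train_y, [5.0, 7.0], sigma=1e6)
print(sharp, smooth)
```

The only free parameter in this form is sigma, which trades off fidelity to the nearest training patterns against smoothing across all of them.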

Our contributions are as follows. First, our work brings GRNN into large-scale IP TM estimation. Although GRNN has found many applications in other scientific fields, to the best of our knowledge this is the first time it has been applied to large-scale IP TM estimation. Second, based on GRNN, we propose a multi-input and multi-output model of the large-scale IP TM estimation problem. As mentioned in [21], Eq. (1) represents a highly under-constrained system because the number of links in a network is much smaller than the number of OD flows. Thus (1) has infinitely many solutions, and finding a meaningful one is a difficult problem. Our model, which can handle any regression problem, realizes the mapping from Y to X in (1), which lets us avoid complex mathematical computation. By training our model with input–output data pairs, we can model the large-scale IP TM estimation problem well. Third, a novel method, namely GRNNTME, is proposed for large-scale IP TM estimation. GRNNTME consists of two processes: training and predicting. The training process builds our model, and the predicting process estimates the TM. To make the estimate more accurate, GRNNTME adjusts it with IPFP so that the resulting estimate satisfies the tomographic constraints of (1). Finally, we use real data [30] from the Abilene network to validate GRNNTME. The dynamics tracking, estimation error, and robustness of GRNNTME are analyzed in detail. Simulation results show that GRNNTME estimates the TM accurately and quickly, tracks its dynamics, and offers stronger robustness and lower estimation error.
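The adjustment step can be pictured with a small sketch. This section does not give the exact IPFP update, so the code below assumes a common proportional-fitting form for 0/1 routing matrices: each link is visited in turn, and every flow that traverses it is rescaled by the ratio of the measured load to the currently implied load. The function name fit_to_link_loads, the number of sweeps and the toy data are hypothetical.

```python
def fit_to_link_loads(A, y, x0, sweeps=200, eps=1e-12):
    """Rescale an initial estimate x0 until A x is close to y.

    A:  0/1 routing matrix (list of rows, one row per link)
    y:  measured link loads
    x0: initial non-negative estimate of the OD flows (e.g. a model output)
    """
    x = list(x0)
    for _ in range(sweeps):
        for row, y_i in zip(A, y):
            # Load that the current estimate implies on this link.
            load = sum(a * x_j for a, x_j in zip(row, x))
            if load <= eps:
                continue  # nothing to rescale on an (almost) empty link
            ratio = y_i / load
            # Scale only the flows that traverse this link.
            x = [x_j * ratio if a else x_j for a, x_j in zip(row, x)]
    return x


# Hypothetical line network a -> b -> c with flows (a,b), (b,c), (a,c).
A = [
    [1, 0, 1],
    [0, 1, 1],
]
y = [5.0, 7.0]         # measured link loads
x0 = [2.5, 4.0, 2.5]   # rough estimate that violates the second constraint

x_fit = fit_to_link_loads(A, y, x0)
fitted_loads = [sum(a * v for a, v in zip(row, x_fit)) for row in A]
print(x_fit, fitted_loads)
```

Because the update is multiplicative, a non-negative starting estimate stays non-negative, and the fitted vector stays close in proportion to the starting estimate while matching the link loads.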

The rest of this paper is organized as follows. Related work is introduced in Section 2. Section 3 describes the large-scale IP TM, the GRNN and its training process. Our method is derived in Section 4. Section 5 presents the simulation results and analysis. We conclude our work in Section 6.

Related work

The TM has been studied extensively to date, and many useful solutions based on its analysis have been proposed. In 1996, Vardi first applied the network tomography method to TM estimation. However, as mentioned in [4], the network tomography method differs from the traditional tomography method. The latter finds a solution by optimizing an objective function, while the former often exploits the higher-order statistics of the link load data to add additional constraints. Vardi modeled each OD flow as

Traffic matrix

The TM describes the traffic flowing between all source nodes and destination nodes, namely ingress nodes and egress nodes. Assume that there are n nodes and L links in a large-scale IP network. Then there exist N = n^2 OD flows. The TM, link loads and routing matrix satisfy the tomographic constraints in (1), where A is the L×N routing matrix of the studied IP network. Generally, L ≪ N holds in a large-scale IP network, which makes the problem denoted by (1) highly

Proposed method: GRNNTME

In this section, we derive the proposed GRNNTME. We first briefly present the basic idea of GRNNTME. Then we describe the details of the TM estimation model, followed by how to predict the large-scale IP TM quickly and accurately with this model. Finally, two algorithms are given, followed by the complete GRNNTME method.

GRNNTME includes the training and predicting processes. First, we propose a multi-input and multi-output model for large-scale IP TM estimation, based on GRNN.

Simulation results

In this section, we conduct a series of simulations to study the performance of GRNNTME. We compare GRNNTME with TomoGravity. Large-scale IP TM estimation is a very difficult problem. TomoGravity has so far been reported as a fast and accurate method of large-scale IP TM estimation, and it has been validated with real data [22]. Moreover, it is now used in practical network engineering.

We use the real data [30] from the Abilene network to simulate the performance of the two methods, namely analyzing the

Conclusion

Based on GRNN, this paper has proposed a novel method, namely the GRNNTME method, for large-scale IP TM estimation. Large-scale IP TM estimation is very difficult since it is a highly ill-posed inverse problem. Estimating the large-scale IP TM quickly and accurately is a challenge that network operators still face. GRNNTME is simple and fast (taking less than ten minutes to estimate the TM in the Abilene network for a 3-week period). Built on top of GRNN, we propose a multi-input and

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Nos. 61071124, 61070162, 71071028, 60802023, 70931001), the Specialized Research Fund for the Doctoral Program of Higher Education (Nos. 20100042120035, 20100042110025, 20070145017), the Open Project of State key Laboratory of Networking and Switching Technology (No. SKLNST-2009-1-04) and the Fundamental Research Funds for the Central Universities (Nos. N090404014, N090504003, N090504006). The authors thank the

References (36)

  • Chen A, Jin Y, Cao J, Li L. Tracking long duration flows in network traffic. In: Proceedings of the IEEE INFOCOM’10....
  • Papagiannaki K, Taft N, Lakhina A. A distributed approach to measure IP traffic matrices. In: Proceedings of the...
  • K.V. Vishwanath et al. Swing: realistic and responsive network traffic generation. IEEE/ACM Trans Netw (2009)
  • Zhang Y, Roughan M, Duffield N, Greenberg A. Fast accurate computation of large-scale IP traffic matrices from link...
  • Stoev S, Michailidis G, Vaughan J. Global modeling of backbone network traffic. In: Proceedings of the IEEE INFOCOM’10....
  • Soule A, Lakhina A, Taft N, Papagiannaki K, Salamatian K, Nucci A, et al. Traffic matrices: balancing measurements,...
  • Y. Shen et al. Design and performance analysis of a practical load-balanced switch. IEEE Trans Commun (2009)
  • Medina A, Taft N, Salamatian K, Bhattacharyya S, Diot C. Traffic matrix estimation: existing techniques and new...
  • Y. Vardi. Network tomography: estimating source-destination traffic intensities from link data. J Am Stat Assoc (1996)
  • A. Soule et al. Estimating dynamic traffic matrices by using viable routing changes. IEEE/ACM Trans Netw (2007)
  • J. Cao et al. Time-varying network tomography: router link data. J Am Stat Assoc (2000)
  • Zhang Y, Roughan M, Lund C, Donoho D. An information theoretic approach to traffic matrix estimation. In: Proceedings...
  • G.Y. Lazarou et al. Describing network traffic using the index of variability. IEEE/ACM Trans Netw (2009)
  • C. Tebaldi et al. Bayesian inference on network traffic using link count data. J Am Stat Assoc (1998)
  • M. Kodialam et al. Oblivious routing of highly variable traffic in service overlays and IP backbones. IEEE/ACM Trans Netw (2009)
  • Soule A, Salamatian K, Nucci A, Taft N. Traffic matrix tracking using Kalman filtering. LIP6 Research Report...
  • D.F. Specht. A general regression neural network. IEEE Trans Neural Netw (1991)
  • Song Y, Ren Y. A predictive model of nonlinear system based on generalized regression neural network. In: Proceedings...

Dingde Jiang received the Ph.D. degree in communication and information systems from the School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu, China, in 2009. He is currently an Associate Professor in the College of Information Science and Engineering, Northeastern University, Shenyang, China. His research interests include network measurement, network security, Internet traffic engineering, and communication networks. Dr. Jiang is a member of IEEE and IEICE.

Zhengzheng Xu is currently working toward the Ph.D. degree in management science and engineering in the College of Information Science and Engineering, Northeastern University, Shenyang, China. She is currently a Research Member at the Key Lab of Comprehensive Automation of Process Industry of Ministry of Education, College of Information Science and Engineering, Northeastern University, Shenyang, China. She is also a Research Member at the Systems Engineering Research Institute at the same university. Her research interests include supply chain and logistics management, decision analysis, modeling and optimization.

Hongwei Xu received the B.Sc. degree in 2010. He is currently working toward a Master's degree in Communication and Information System, Northeastern University, China. His research interests include network measurement and performance analysis.

Yang Han received the B.Sc. degree in 2010. He is currently working toward a Master's degree in Communication and Information System, Northeastern University, China. His research interests include wireless networks and cognitive networks.

Zhenhua Chen received the B.Sc. degree from the College of Information Science and Engineering, Northeastern University, Shenyang, China, in 2010. He is currently working toward a Master's degree in Communication and Information System, Northeastern University, China. His research interests include network measurement and cognitive networks.

Zhen Yuan is currently an undergraduate in Communication Engineering, Northeastern University, China. Her research interests include network measurement and network security.

Reviews processed and approved for publication by Editor-in-Chief Dr. Manu Malek.
