ABSTRACT
Network traffic arises from the superposition of Origin-Destination (OD) flows. Hence, a thorough understanding of OD flows is essential for modeling network traffic, and for addressing a wide variety of problems including traffic engineering, traffic matrix estimation, capacity planning, forecasting and anomaly detection. However, to date, OD flows have not been closely studied, and very little is known about their properties.

We present the first analysis of complete sets of OD flow time-series, taken from two different backbone networks (Abilene and Sprint-Europe). Using Principal Component Analysis (PCA), we find that the set of OD flows has small intrinsic dimension. In fact, even in a network with over a hundred OD flows, these flows can be accurately modeled in time using a small number (10 or fewer) of independent components or dimensions.

We also show how to use PCA to systematically decompose the structure of OD flow time-series into three main constituents: common periodic trends, short-lived bursts, and noise. We provide insight into how the various constituents contribute to the overall structure of OD flows and explore the extent to which this decomposition varies over time.
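The idea of small intrinsic dimension can be illustrated with a minimal sketch. It is not the authors' code and does not use their data: the flow matrix is synthetic, built from two shared periodic trends plus noise, and every name in it (`synthetic_od_flows`, `covariance`, `leading_eigenvalues`, the loadings, the 95% threshold) is an assumption made for illustration. PCA is approximated here by power iteration with deflation on the covariance matrix, which is one of several ways to obtain the leading principal components.

```python
import math
import random


def synthetic_od_flows(num_bins=200, num_flows=12, noise=0.05, seed=1):
    """Build a (time bins x flows) matrix driven by two shared periodic trends.

    All values are synthetic; the loadings below are hypothetical.
    """
    rng = random.Random(seed)
    rows = []
    for t in range(num_bins):
        trend1 = math.sin(2 * math.pi * t / 50)
        trend2 = math.cos(2 * math.pi * t / 25)
        row = []
        for i in range(num_flows):
            w1 = 1 + i % 3                 # hypothetical loading on trend 1
            w2 = 2 if i % 2 == 0 else -2   # hypothetical loading on trend 2
            row.append(10 + w1 * trend1 + w2 * trend2 + rng.gauss(0, noise))
        rows.append(row)
    return rows


def covariance(rows):
    """Sample covariance matrix of the columns (one column per OD flow)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def matvec(mat, vec):
    return [sum(m * x for m, x in zip(row, vec)) for row in mat]


def leading_eigenvalues(cov, threshold=0.95, iters=300, seed=0):
    """Return the leading eigenvalues needed to reach `threshold` of total variance.

    Uses power iteration with deflation; the total variance is the trace.
    """
    rng = random.Random(seed)
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))
    mat = [row[:] for row in cov]
    eigenvalues, captured = [], 0.0
    while captured < threshold * total and len(eigenvalues) < p:
        v = [rng.gauss(0, 1) for _ in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = matvec(mat, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(mat, v)))  # Rayleigh quotient
        eigenvalues.append(lam)
        captured += lam
        # Deflate: remove the component just found before looking for the next.
        for i in range(p):
            for j in range(p):
                mat[i][j] -= lam * v[i] * v[j]
    return eigenvalues


if __name__ == "__main__":
    flows = synthetic_od_flows()
    eig = leading_eigenvalues(covariance(flows))
    print("components needed for 95% of variance:", len(eig))
```

Because the twelve synthetic flows are generated from only two underlying trends, a small number of components accounts for nearly all of the variance, which mirrors, in a toy setting, the low-dimensionality finding described above.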
Index Terms
- Structural analysis of network traffic flows