research-article

Characterizing per-application network traffic using entropy

Published: 10 May 2013

Abstract

The Internet has been evolving into a more heterogeneous internetwork with diverse new applications imposing more stringent bandwidth and QoS requirements. Already new applications such as YouTube, Hulu, and Netflix are consuming a large fraction of the total bandwidth. We argue that, in order to engineer future internets such that they can adequately cater to their increasingly diverse and complex set of applications while using resources efficiently, it is critical to be able to characterize the load that emerging and future applications place on the underlying network. In this article, we investigate entropy as a metric for characterizing per-flow network traffic complexity. While previous work has analyzed aggregated network traffic, we focus on studying isolated traffic flows. Per-application flow characterization caters to the need of network control functions such as traffic scheduling and admission control at the edges of the network. Such control functions necessitate differentiating network traffic on a per-application basis. The “entropy fingerprints” that we get from our entropy estimator summarize many characteristics of each application's network traffic. Not only can we compare applications on the basis of peak entropy, but we can also categorize them based on a number of other properties of the fingerprints.

References

  1. Apple. 2010a. iChat in OS X Leopard. http://www.apple.com/asia/macosx/leopard/features/ichat.html.
  2. Apple. 2010b. iChat Wikipedia entry. http://en.wikipedia.org/wiki/Ichat.
  3. Basharin, G. 1959. On a statistical estimate for the entropy of a sequence of independent random variables. Theory Probab. Appl. 4, 333.
  4. Beran, J., Sherman, R., Taqqu, M., and Willinger, W. 1995. Long-range dependence in variable-bit-rate video traffic. IEEE Trans. Comm. 43, 2/3/4, 1566--1579.
  5. Berkeley, L. 2001. National laboratory network research. tcpdump: The protocol packet capture and dumper program. http://www.tcpdump.org. In The Protocol Packet Capture and Dumper Program, 2003. 164.
  6. Bonfiglio, D., Mellia, M., Meo, M., and Rossi, D. 2009. Detailed analysis of Skype traffic. IEEE Trans. Multimed. 11, 1, 117--127.
  7. Contributors. 2010. YouTube Wikipedia entry. http://en.wikipedia.org/w/index.php?title=Youtube&oldid=380031496.
  8. Cover, T. M. and Thomas, J. A. 1991. Elements of Information Theory. Wiley-Interscience, New York.
  9. Crovella, M. E. and Bestavros, A. 1996. Self-similarity in World Wide Web traffic: Evidence and possible causes. IEEE/ACM Trans. Netw. 5, 835--846.
  10. Feinstein, L., Schnackenberg, D., Balupari, R., and Kindred, D. 2003. Statistical approaches to DDoS attack detection and response. In Proceedings of the DARPA Information Survivability Conference and Exposition. 303--314.
  11. Gao, Y., Kontoyiannis, I., and Bienenstock, E. 2006. From the entropy to the statistical structure of spike trains. In Proceedings of the IEEE International Symposium on Information Theory. 645--649.
  12. Google. 2010. GoogleTalk developer info. http://code.google.com/apis/talk/open_communications.html.
  13. Hulu. Hulu media faq. http://www.hulu.com/about/media_faq.
  14. Hunt, N. 2008. Netflix encoding for streaming. http://blog.netflix.com/2008/11/encoding-for-streaming.html.
  15. Karagiannis, T., Faloutsos, M., and Molle, M. 2003. A user-friendly self-similarity analysis tool. SIGCOMM Comput. Comm. Rev. 33, 3, 81--93.
  16. Lakhina, A., Crovella, M., and Diot, C. 2005. Mining anomalies using traffic feature distributions. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM'05). ACM, New York, 217--228.
  17. Lall, A., Sekar, V., Ogihara, M., Xu, J., and Zhang, H. 2006. Data streaming algorithms for estimating entropy of network traffic. In Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS'06/Performance'06). ACM, New York, 145--156.
  18. Leland, W. E., Taqqu, M. S., Willinger, W., and Wilson, D. V. 1993. On the self-similar nature of Ethernet traffic. In Conference Proceedings on Communications Architectures, Protocols and Applications (SIGCOMM'93). ACM, New York, 183--193.
  19. Norris, R. 1998. Markov Chains (Cambridge Series in Statistics and Probabilistic Mathematics). Cambridge University Press.
  20. Park, K., Kim, G., and Crovella, M. 1996. On the relationship between file sizes, transport protocols, and self-similar network traffic. In Proceedings of the IEEE International Conference on Network Protocols. 171--180.
  21. Paxson, V. and Floyd, S. 1995. Wide area traffic: The failure of Poisson modeling. IEEE/ACM Trans. Netw. 3, 3, 226--244.
  22. Perényi, M. and Molnár, S. 2007. Enhanced Skype traffic identification. In Proceedings of the 2nd International Conference on Performance Evaluation Methodologies and Tools (ValueTools'07). ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), Brussels, Belgium, 1--9.
  23. Richman, J. S. and Moorman, J. R. 2000. Physiological time-series analysis using approximate entropy and sample entropy. Amer. J. Physiol. Heart Circ. Physiol. 278, 6, H2039--2049.
  24. Riihijarvi, J., Wellens, M., and Mahonen, P. 2009. Measuring complexity and predictability in networks with multiscale entropy analysis. In Proceedings of IEEE INFOCOM. 1107--1115.
  25. Roberts, L. 2009. A radical new router. IEEE Spectrum.
  26. Rossi, D., Valenti, S., Veglia, P., Bonfiglio, D., Mellia, M., and Meo, M. 2008. Pictures from the Skype. SIGMETRICS Perform. Eval. Rev. 36, 2, 83--86.
  27. Sandvine Incorporated. 2011. Global internet phenomena report. http://www.sandvine.com/news/global_broadband_trends.asp.
  28. Sang, A. and Li, S. 2000. A predictability analysis of network traffic. In Proceedings of the 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'00). 342--351.
  29. Shannon, C. E. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379--423.
  30. Vu, V., Yu, B., and Kass, R. 2007. Coverage-adjusted entropy estimation. Stat. Med. 26, 21, 4039--4060.
  31. Wagner, A. and Plattner, B. 2005. Entropy based worm and anomaly detection in fast IP networks. In Proceedings of the 14th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprise (WETICE'05). IEEE Computer Society, Los Alamitos, CA, 172--177.
  32. Walsworth, C., Aben, E., Claffy, K., and Andersen, D. 2009. The CAIDA anonymized 2009 Internet traces - (jan 15). http://www.caida.org/data/passive/passive_2009_dataset.xml.
  33. Willems, F., Shtarkov, Y., and Tjalkens, T. 1995. The context-tree weighting method: Basic properties. IEEE Trans. Info. Theory 41, 3, 653--664.
  34. Willinger, W., Taqqu, M. S., Sherman, R., and Wilson, D. V. 1997. Self-similarity through high-variability: Statistical analysis of Ethernet LAN traffic at the source level. IEEE/ACM Trans. Netw. 5, 1, 71--86.
  35. Xu, K., Zhang, Z.-L., and Bhattacharya, S. 2005. Profiling internet backbone traffic: Behavior models and applications. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM'05). ACM, New York, 169--180.

    Reviews

    Amos O Olagunju

    Several emerging algorithms for data mining, modeling, and simulation offer support for the visualization of business data and information. Business applications with intensive computational needs require the efficient use of bandwidth to operate effectively over the Internet. How should these increasingly multifaceted applications be characterized for the effective engineering of future networks and the Internet? The authors of this paper examine entropy as a measure of the complexity of per-application network traffic flows. They use real-time applications (Skype, iChat, and Google Talk) and streaming media (Hulu, Netflix, and ABC's webcam stream) to investigate self-similarity and develop an entropy metric for characterizing flows in network traffic. The experimental network traces, captured with tcpdump, cover buffered, bursty, bandwidth-dependent, and codec-dependent applications that use the transmission control protocol (TCP) or user datagram protocol (UDP) to transport data.

    The cumulative distributions of the packet inter-arrival times of real-time and media streaming flows are displayed to illustrate patterns in the traffic. Flows for Skype voice over Internet protocol (VoIP) and video conferencing show distinct patterns of audio and video traffic. The patterns of iChat audio and video flows were similar, but the Skype audio flow pattern was less complicated than the iChat audio flow pattern. The flow pattern of Google Talk audio was less complicated than that of Skype audio, and the traffic flow patterns of Google Talk audio and video were different. There were no discernible differences in the traffic flow patterns of Hulu, ABC, and Netflix. The average flow rate of traffic is used to estimate self-similarity for forecasting network traffic flows. There was only weak evidence to support the self-similarity of video and audio flows in iChat and Skype traffic.

    Consequently, the authors developed a multiscale plug-in that estimates packet timing entropy, to capture the predictable finite memory of arriving packets in time intervals and compute the packet size sequences. They then use a flow trace of packet arrival timestamps to validate the accuracy of the entropy estimator. The entropy estimator is shown to be reliable in predicting the arrival of packets in time intervals for real-time and streaming media audio and video traffic flows. Unlike well-known predictors that assume distribution models to forecast traffic flows [1], the entropy estimator uses a table of bit patterns with associated probabilities in its prediction, without assuming any model. The entropy estimator generates entropic peaks for comparing and classifying video and audio applications. The authors provide valuable insights on the use of entropy estimator fingerprints for network intrusion detection, admission control of application flows, and strategic traffic scheduling based on the available bandwidth. Although the study only looked at the effects of packet timing on traffic flows and not packet size, all current and future network engineers should find this incredible paper interesting.

    Online Computing Reviews Service
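    The idea of a plug-in estimator built from a table of bit patterns and their empirical probabilities can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `binarize_arrivals` and `plugin_entropy_rate`, the rule that a time bin is marked 1 when at least one packet arrives in it, and the use of overlapping fixed-length words are all assumptions made for illustration.

```python
import math
from collections import Counter


def binarize_arrivals(timestamps, bin_width):
    """Turn packet arrival timestamps into a 0/1 sequence.

    Assumption for this sketch: a bin is 1 if at least one packet
    arrived during it, and 0 otherwise.
    """
    if not timestamps:
        return []
    start = min(timestamps)
    n_bins = int((max(timestamps) - start) / bin_width) + 1
    bits = [0] * n_bins
    for t in timestamps:
        bits[int((t - start) / bin_width)] = 1
    return bits


def plugin_entropy_rate(bits, word_len):
    """Plug-in entropy estimate in bits per bin.

    Builds a table of observed bit patterns (overlapping words of
    length word_len), converts counts to empirical probabilities, and
    returns the Shannon entropy of that table divided by word_len.
    """
    n_words = len(bits) - word_len + 1
    if n_words <= 0:
        raise ValueError("sequence is shorter than the word length")
    counts = Counter(tuple(bits[i:i + word_len]) for i in range(n_words))
    entropy = -sum(
        (c / n_words) * math.log2(c / n_words) for c in counts.values()
    )
    return entropy / word_len


if __name__ == "__main__":
    # Hypothetical trace: one packet every 2 time units.
    arrivals = list(range(0, 200, 2))
    for width in (1, 2, 4):  # a crude "multiscale" sweep over bin widths
        seq = binarize_arrivals(arrivals, width)
        print(width, plugin_entropy_rate(seq, word_len=2))
```

    Sweeping the bin width, as in the usage example, gives one entropy value per time scale. A strictly periodic trace yields low values at longer word lengths because few distinct patterns occur, whereas an irregular trace spreads probability over many patterns and scores higher.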

    • Published in

      ACM Transactions on Modeling and Computer Simulation  Volume 23, Issue 2
      May 2013
      92 pages
      ISSN:1049-3301
      EISSN:1558-1195
      DOI:10.1145/2457459

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 10 May 2013
      • Accepted: 1 October 2012
      • Revised: 1 October 2011
      • Received: 1 June 2011


      Qualifiers

      • research-article
      • Research
      • Refereed
