Achieving 10Gbps Network Processing: Are We There Yet?

  • Conference paper
High Performance Computing - HiPC 2008 (HiPC 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5374)

Abstract

Scaling TCP/IP receive-side processing to 10Gbps on commercial server platforms has been a major challenge. It led to the development of two key techniques: Large Receive Offload (LRO) and Direct Cache Access (DCA). Systems supporting both techniques have only recently become available, so we evaluate them in detail using 10 Gigabit NICs to find out whether 10Gbps rates are finally achievable, what performance benefit the techniques offer, and which major overheads remain. Our measurements show that LRO and DCA together improve TCP/IP receive performance by more than 50% over the base case (neither LRO nor DCA enabled). Combined with improvements in CPU architecture and the rest of the platform over the last 3-4 years, these two techniques have more than doubled TCP/IP receive processing throughput, to 7Gbps. Our detailed architectural characterization of TCP/IP processing with both features enabled reveals that buffer management and copy operations still take up a significant amount of processing time. We also analyze the scaling behavior of TCP/IP to determine how multi-core architectures improve network processing. This part of the analysis highlights some limiting factors that need to be addressed to achieve scaling beyond 10Gbps.
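
To give a rough sense of why receive-side processing at 10Gbps is demanding, the following back-of-the-envelope sketch computes the time available per unit of work handed to the TCP/IP stack. The 1500-byte frame size and the aggregation factor of 8 are illustrative assumptions, not figures from the paper, and the function name is hypothetical. The calculation ignores Ethernet preamble and inter-frame gap.

```python
# Back-of-the-envelope time budget for receive processing at line rate.
# Assumptions (not from the paper): 1500-byte frames, no preamble or
# inter-frame gap, and a hypothetical LRO aggregation factor.

LINK_RATE_BPS = 10e9   # 10Gbps line rate
FRAME_BYTES = 1500     # assumed frame size


def per_unit_budget_ns(link_rate_bps: float, frame_bytes: int, coalesce: int = 1) -> float:
    """Nanoseconds available per unit passed to the stack.

    `coalesce` is the number of frames merged into one unit before the
    stack sees them (1 means no aggregation).
    """
    if coalesce < 1:
        raise ValueError("coalesce must be at least 1")
    frame_time_ns = frame_bytes * 8 / link_rate_bps * 1e9
    return coalesce * frame_time_ns


if __name__ == "__main__":
    base = per_unit_budget_ns(LINK_RATE_BPS, FRAME_BYTES)
    merged = per_unit_budget_ns(LINK_RATE_BPS, FRAME_BYTES, coalesce=8)
    print(f"per-frame budget: {base:.0f} ns")
    print(f"budget with 8 frames merged: {merged:.0f} ns")
```

Under these assumptions a frame arrives roughly every 1.2 microseconds, so any per-packet cost in the stack must fit within that window. Merging several frames into one larger unit, which is the idea behind LRO, stretches the budget for per-packet work proportionally, although per-byte costs such as copies are unaffected.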

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Govindarajan, P., Makineni, S., Newell, D., Iyer, R., Huggahalli, R., Kumar, A. (2008). Achieving 10Gbps Network Processing: Are We There Yet?. In: Sadayappan, P., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing - HiPC 2008. HiPC 2008. Lecture Notes in Computer Science, vol 5374. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89894-8_45

  • DOI: https://doi.org/10.1007/978-3-540-89894-8_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89893-1

  • Online ISBN: 978-3-540-89894-8

  • eBook Packages: Computer Science, Computer Science (R0)
