Abstract
Scaling TCP/IP receive-side processing to 10Gbps on commercial server platforms has been a major challenge, and it has led to the development of two key techniques: Large Receive Offload (LRO) and Direct Cache Access (DCA). Systems supporting both techniques have only recently become available. We therefore evaluate them in detail using 10 Gigabit NICs to determine whether 10Gbps rates are finally achievable, to understand the performance benefit they offer, and to identify the major overheads that remain. Our measurements show that LRO and DCA together improve TCP/IP receive performance by more than 50% over the base case (neither LRO nor DCA). Combined with improvements in CPU architecture and the rest of the platform over the last 3-4 years, these two techniques have more than doubled TCP/IP receive processing throughput, to 7Gbps. Our detailed architectural characterization of TCP/IP processing with both features enabled reveals that buffer management and copy operations still take up a significant amount of processing time. We also analyze the scaling behavior of TCP/IP to understand how multi-core architectures improve network processing. This part of the analysis highlights some limiting factors that must be addressed to scale beyond 10Gbps.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Govindarajan, P., Makineni, S., Newell, D., Iyer, R., Huggahalli, R., Kumar, A. (2008). Achieving 10Gbps Network Processing: Are We There Yet?. In: Sadayappan, P., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing - HiPC 2008. HiPC 2008. Lecture Notes in Computer Science, vol 5374. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89894-8_45
DOI: https://doi.org/10.1007/978-3-540-89894-8_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89893-1
Online ISBN: 978-3-540-89894-8
eBook Packages: Computer Science (R0)