DOI: 10.1145/1095890.1095892
Article

Overcoming the memory wall in packet processing: hammers or ladders?

Published: 26 October 2005

ABSTRACT

Memory access overhead limits the performance of packet processing applications. To overcome this bottleneck, today's network processors can use a wide range of mechanisms, such as multi-level memory hierarchies, wide-word accesses, special-purpose result caches, asynchronous memory, and hardware multi-threading. However, supporting all of these mechanisms complicates programmability and hardware design, and wastes system resources. In this paper, we address the following fundamental question: what minimal set of hardware mechanisms must a network processor support to achieve the twin goals of simplified programmability and high packet throughput? We show that no single mechanism suffices; the minimal set must include data caches and multi-threading. Data caches and multi-threading are complementary: data caches exploit locality to reduce the number of context switches and the off-chip memory bandwidth requirement, whereas multi-threading exploits parallelism to hide long cache-miss latencies.
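The complementarity claimed above can be illustrated with a back-of-the-envelope model. The sketch below is not the paper's evaluation method; it is a simple deterministic multithreading model under stated assumptions: every thread computes for `run_cycles` between memory accesses, a cache miss stalls the thread for `miss_latency` cycles, hits are free, and context switches cost nothing. All function and parameter names are hypothetical.

```python
def utilization(threads, run_cycles, miss_latency, hit_rate):
    """Estimate the fraction of cycles a core spends on useful work.

    Assumed model (illustrative only, not taken from the paper):
    - a thread runs `run_cycles` cycles between memory accesses,
    - a fraction `hit_rate` of accesses hit the data cache at no cost,
    - each miss stalls the issuing thread for `miss_latency` cycles,
    - switching to another ready thread is free.
    """
    if hit_rate >= 1.0:
        return 1.0  # no misses, so the core never stalls
    # A higher hit rate lengthens the average run between two misses,
    # which is also the run between two context switches.
    run_between_misses = run_cycles / (1.0 - hit_rate)
    # With enough threads, one thread's stall is covered by the others' runs.
    busy = threads * run_between_misses / (run_between_misses + miss_latency)
    return min(1.0, busy)


if __name__ == "__main__":
    # Compare: neither mechanism, threads only, cache only, both together.
    for threads, hit_rate in [(1, 0.0), (8, 0.0), (1, 0.9), (2, 0.9)]:
        u = utilization(threads, run_cycles=10, miss_latency=100, hit_rate=hit_rate)
        print(f"threads={threads} hit_rate={hit_rate:.1f} utilization={u:.2f}")
```

In this toy model, a cache alone still leaves the core idle during each miss, and threads alone need many contexts to cover the latency; combining a modest hit rate with a few threads saturates the core, which matches the qualitative argument of the abstract.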


            Published in

            ANCS '05: Proceedings of the 2005 ACM symposium on Architecture for networking and communications systems
            October 2005
            230 pages
            ISBN: 1595930825
            DOI: 10.1145/1095890

            Copyright © 2005 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            Overall Acceptance Rate: 88 of 314 submissions, 28%
