High-performance packet classification algorithm for multithreaded IXP network processor

Authors:

Xinan Tang

ACM Transactions on Embedded Computing Systems (TECS), Volume 7, Issue 2

Article No.: 16, Pages 1 - 25

https://doi.org/10.1145/1331331.1331340

Published: 29 January 2008

Abstract

Packet classification is crucial for the Internet to provide more value-added services and guaranteed quality of service. Besides hardware-based solutions, many software-based classification algorithms have been proposed. However, classifying at 10 Gbps or higher remains a challenging problem and is still one of the performance bottlenecks in core routers. In general, classification algorithms face the same challenge of balancing high classification speed against low memory requirements. This paper proposes a modified recursive flow classification (RFC) algorithm, Bitmap-RFC, which significantly reduces the memory requirements of RFC by applying a bitmap compression technique. To increase classification speed, we exploit multithreaded architectural features at every stage of algorithm development, from design to implementation. As a result, Bitmap-RFC strikes a good balance between speed and space: it maintains high classification speed while significantly reducing memory consumption. This paper investigates the main NPU software design aspects that have a dramatic performance impact on any NPU-based implementation: memory space reduction, instruction selection, data allocation, task partitioning, and latency hiding. We experiment with an architecture-aware design principle to guarantee the high performance of the classification algorithm on an NPU implementation. The experimental results show that the Bitmap-RFC algorithm achieves 10 Gbps or higher and scales well on the Intel IXP2800 NPU.
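
The abstract names bitmap compression but does not describe the data layout. As a rough illustration only, the sketch below shows one common form of the idea: a lookup table whose entries repeat in runs is stored as a bitmap marking where each run starts, plus one value per run, and a lookup counts the set bits up to the index. The 32-entry table size and the names `compressed_table`, `compress_table`, and `lookup` are assumptions made for this example and are not taken from the paper.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_SIZE 32  /* assumption: one 32-bit bitmap covers 32 entries */

/* Hypothetical compressed form of a table whose entries repeat in runs. */
typedef struct {
    uint32_t bitmap;              /* bit i set => entry i starts a new run */
    uint16_t values[TABLE_SIZE];  /* one stored value per run, in order    */
    int      num_values;          /* number of runs actually stored        */
} compressed_table;

/* Portable population count: clears the lowest set bit until none remain. */
static int popcount32(uint32_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Build the bitmap and the per-run value list from an uncompressed table. */
static void compress_table(const uint16_t *table, compressed_table *out)
{
    out->bitmap = 0;
    out->num_values = 0;
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (i == 0 || table[i] != table[i - 1]) {
            out->bitmap |= (uint32_t)1 << i;
            out->values[out->num_values++] = table[i];
        }
    }
}

/* Return entry i (0 <= i < TABLE_SIZE) of the original table. */
static uint16_t lookup(const compressed_table *ct, int i)
{
    /* Keep bits 0..i; bit 0 is always set, so the count is at least 1. */
    uint32_t mask = 0xFFFFFFFFu >> (31 - i);
    return ct->values[popcount32(ct->bitmap & mask) - 1];
}
```

In this sketch a table of 32 entries made of four runs is stored as four values plus one 32-bit word, at the cost of a population count on every lookup. That trade is attractive mainly where counting set bits is cheap, for example on hardware with a dedicated instruction for it.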

Information

Published In

ACM Transactions on Embedded Computing Systems Volume 7, Issue 2

February 2008

412 pages

ISSN:1539-9087

EISSN:1558-3465

DOI:10.1145/1331331

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 29 January 2008

Accepted: 01 June 2007

Revised: 01 June 2007

Received: 01 January 2007

Published in TECS Volume 7, Issue 2

Qualifiers

Research-article
Research
Refereed

Cited By

Wooguil Pak, Young-June Choi (2013). High-Performance Packet Classification for Network-Device Platforms. IEEE Communications Letters 17:6, 1252-1255. Online publication date: Jun-2013.
https://doi.org/10.1109/LCOMM.2013.051313.121778
Zhian H, Jokar A, Farrokhi N, Sabaei M (2013). A multi-thread based approach for IP address lookup. 2013 21st Iranian Conference on Electrical Engineering (ICEE), 1-4. Online publication date: May-2013.
https://doi.org/10.1109/IranianCEE.2013.6599786
Chang Y, Kuo F (2013). Hint-based cache design for reducing miss penalty in HBS packet classification algorithm. Journal of Parallel and Distributed Computing 73:8, 1170-1182. Online publication date: 1-Aug-2013.
https://dl.acm.org/doi/10.1016/j.jpdc.2013.03.005
Duncan R, Jungck P, Ross K (2011). packetC Language and Parallel Processing of Masked Databases. In: Jungck P, Duncan R, Mulcahy D, packetC Programming, 335-344. Online publication date: 2011.
https://doi.org/10.1007/978-1-4302-4159-1_31
Duncan R, Jungck P, Ross K (2010). PacketC Language and Parallel Processing of Masked Databases. Proceedings of the 2010 39th International Conference on Parallel Processing, 472-481. Online publication date: 13-Sep-2010.
https://dl.acm.org/doi/10.1109/ICPP.2010.55
Chang Y, Fang-Chen Kuo (2010). Towards optimized packet processing for multithreaded network processor. 2010 International Conference on High Performance Switching and Routing, 127-132. Online publication date: Jun-2010.
https://doi.org/10.1109/HPSR.2010.5580281
Wang J, Cheng H, Hua B, Tang X (2009). Practice of parallelizing network applications on multi-core architectures. In: Gschwind M, Nicolau A, Salapura V, Moreira J, Proceedings of the 23rd international conference on Supercomputing, 204-213. Online publication date: 8-Jun-2009.
https://dl.acm.org/doi/10.1145/1542275.1542307
Liu Y, Xu D, Mu Z, Qin J (2009). Efficient Hybrid Packet Classification in Traffic Control System Using Network Processors. Proceedings of the 2009 International Conference on Advanced Computer Control, 57-61. Online publication date: 22-Jan-2009.
https://dl.acm.org/doi/10.1109/ICACC.2009.31
Fouliras P (2008). On RTP filtering for network traffic reduction. Proceedings of the 6th International Conference on Advances in Mobile Computing and Multimedia, 356-359. Online publication date: 24-Nov-2008.
https://dl.acm.org/doi/10.1145/1497185.1497261
Liu Y, Xu D, Sun L, Liu D (2008). Accurate Traffic Classification with Multi-threaded Processors. 2008 IEEE International Symposium on Knowledge Acquisition and Modeling Workshop, 478-481. Online publication date: Dec-2008.
https://doi.org/10.1109/KAMW.2008.4810528
