Recursive design of hardware priority queues

Authors:

Anat Bremler-Barr,

Liron Schiff

SPAA '13: Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures

Pages 23 - 32

https://doi.org/10.1145/2486159.2486194

Published: 23 July 2013

Abstract

A recursive and fast construction of an n-element priority queue from exponentially smaller hardware priority queues and a size-n RAM is presented. All priority queue implementations to date require either O(log n) instructions per operation, space exponential in the key size, or expensive special hardware whose cost and latency increase dramatically with the priority queue size. Hence, constructing a priority queue (PQ) from considerably smaller hardware priority queues (which are also much faster) while maintaining O(1) steps per PQ operation is critical. Here we present such an acceleration technique, called the Power Priority Queue (PPQ) technique. Specifically, an n-element PPQ is constructed from 2k-1 primitive priority queues of size $\sqrt[k]{n}$ (k = 2, 3, ...) and a RAM of size n, where the throughput of the construct beats that of a single, size-n primitive hardware priority queue. For example, an n-element PQ can be constructed from either three primitive hardware priority queues of size $\sqrt{n}$ or five of size $\sqrt[3]{n}$.
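The component counts quoted above can be checked with a little arithmetic. The following is a minimal sketch, not code from the paper: the function name `ppq_components` is hypothetical, and rounding the k-th root of n up to the next integer is an assumption, since the abstract does not say how non-integer roots are handled.

```python
from typing import Tuple


def ppq_components(n: int, k: int) -> Tuple[int, int]:
    """Return (number of primitive queues, size of each) for an n-element PPQ.

    Follows the abstract's figures: 2k-1 primitive queues, each holding
    about the k-th root of n elements. Rounding up is an assumption.
    """
    if n < 1 or k < 2:
        raise ValueError("need n >= 1 and k >= 2")
    # Start from a floating-point estimate of the k-th root of n ...
    size = round(n ** (1.0 / k))
    # ... then correct it with exact integer arithmetic so that
    # size is the smallest integer with size**k >= n.
    while size ** k < n:
        size += 1
    while size > 1 and (size - 1) ** k >= n:
        size -= 1
    return 2 * k - 1, size


if __name__ == "__main__":
    for k in (2, 3):
        count, size = ppq_components(1_000_000, k)
        print(f"k={k}: {count} primitive queues of about {size} entries each")
```

For n = 1,000,000 this gives three queues of 1,000 entries for k = 2 and five queues of 100 entries for k = 3, which matches the "three or five" example in the abstract.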

Applying our technique to a TCAM-based priority queue yields TCAM-PPQ, a scalable, perfect line-rate fair queuing of millions of concurrent connections at speeds of 100 Gbps. This demonstrates the benefits of our scheme when used with hardware TCAM; we expect similar results with systolic arrays, shift registers, and similar technologies.

As a by-product of our technique, we present an O(n)-time sorting algorithm for a system equipped with an O(w√n)-entry TCAM, where n is the number of items and w is the maximum number of bits required to represent an item, improving on a previous result that used an Ω(n)-entry TCAM. Finally, we provide a lower bound on the time complexity of sorting n elements with a TCAM of size O(n) that matches our TCAM-based sorting algorithm.
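To get a rough feel for the TCAM size bound mentioned above, the sketch below evaluates w·√n for concrete values. It is illustrative only: the function name `tcam_entries` is hypothetical, and the constant hidden by the O-notation is assumed to be 1, which the abstract does not state.

```python
import math


def tcam_entries(n: int, w: int) -> int:
    """Evaluate w * ceil(sqrt(n)), taking the hidden constant as 1 (an assumption)."""
    if n < 1 or w < 1:
        raise ValueError("need n >= 1 and w >= 1")
    root = math.isqrt(n)   # floor of the square root, exact for integers
    if root * root < n:    # round up when n is not a perfect square
        root += 1
    return w * root


if __name__ == "__main__":
    n, w = 1_000_000, 32
    small = tcam_entries(n, w)
    print(f"w*sqrt(n) = {small} entries, versus n = {n} entries")
```

Under these assumptions, sorting one million 32-bit items would call for on the order of 32,000 TCAM entries rather than on the order of one million.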

Cited By

Collinson S, Bai A, Sinnen O (2024). A Fast Scalable Hardware Priority Queue and Optimizations for Multi-Pushes. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 134-140. Online publication date: 27-May-2024. https://doi.org/10.1109/IPDPSW63119.2024.00038

Bremler-Barr A, Harchol Y, Hay D, Hel-Or Y (2018). Encoding Short Ranges in TCAM Without Expansion. IEEE/ACM Transactions on Networking 26:2, pp. 835-850. Online publication date: 1-Apr-2018. https://dl.acm.org/doi/10.1109/TNET.2018.2797690

Bremler-Barr A, Harchol Y, Hay D, Hel-Or Y (2016). Encoding Short Ranges in TCAM Without Expansion. Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, pp. 35-46. Online publication date: 11-Jul-2016. https://dl.acm.org/doi/10.1145/2935764.2935769

Published In

SPAA '13: Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures

July 2013

348 pages

ISBN: 9781450315722

DOI: 10.1145/2486159

General Chair: Guy Blelloch, Carnegie Mellon University, USA

Program Chair: Berthold Vöcking, RWTH Aachen University, Germany

Copyright © 2013 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SPAA '13: 25th ACM Symposium on Parallelism in Algorithms and Architectures

July 23 - 25, 2013

Montréal, Québec, Canada

Acceptance Rates

SPAA '13 paper acceptance rate: 31 of 130 submissions, 24%

Overall acceptance rate: 447 of 1,461 submissions, 31%
