Abstract
Three basic structures have been proposed for organizing the task queues of shared-memory multiprocessor systems: centralized, distributed, and hierarchical. Centralized structures are not suitable for massively parallel systems, since the shared queue becomes a bottleneck under frequent enqueue and dequeue operations. Distributed structures suffer from load imbalance because they provide no support for workload sharing between queues. Hierarchical structures are intended to combine the advantages of the previous two structures while eliminating their disadvantages. Unfortunately, we find that load imbalance still exists in the hierarchical structure and has a significant impact on system performance, particularly when the workload is heavy and irregular. After identifying the cause of this problem, we propose using a clustered structure in place of the hierarchical one. Analysis and simulations show that the proposed structure provides better load balancing and less contention than the hierarchical one.
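The load imbalance attributed to distributed structures can be illustrated with a toy calculation. The sketch below is not the paper's model: the function names, the round-robin assignment, the workload, and the choice to ignore queue-access cost (and therefore the contention that penalizes the centralized structure) are all assumptions made for illustration. It compares the completion time of an irregular batch of tasks under private per-processor queues with no sharing against a single shared queue.

```python
import heapq


def distributed_makespan(service_times, num_procs):
    """Private queue per processor, tasks assigned round-robin, no sharing.

    Assumption: a processor never takes work from another queue,
    so the finish time is the largest per-queue total.
    """
    load = [0] * num_procs
    for i, t in enumerate(service_times):
        load[i % num_procs] += t
    return max(load)


def centralized_makespan(service_times, num_procs):
    """Single shared FIFO queue: the first idle processor takes the next task.

    Assumption: enqueue/dequeue cost is zero, so contention on the
    shared queue is deliberately not modeled here.
    """
    free_at = [0] * num_procs  # time at which each processor becomes idle
    heapq.heapify(free_at)
    for t in service_times:
        start = heapq.heappop(free_at)      # earliest idle processor
        heapq.heappush(free_at, start + t)  # busy until start + t
    return max(free_at)


# Hypothetical irregular workload: two long tasks among short ones.
tasks = [8, 1, 1, 1, 8, 1, 1, 1]
print(distributed_makespan(tasks, 4))  # 16
print(centralized_makespan(tasks, 4))  # 9
```

With this workload both long tasks land in the same private queue, so one processor works for 16 time units while the others idle after 2. The shared queue finishes in 9 time units, but only because the sketch treats queue access as free, which is exactly the cost that makes centralized structures a bottleneck at scale.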
Cite this article
Zhu, W. Cluster Queue Structure for Shared-Memory Multiprocessor Systems. The Journal of Supercomputing 25, 215–236 (2003). https://doi.org/10.1023/A:1024247027039
DOI: https://doi.org/10.1023/A:1024247027039