Abstract
Simultaneous MultiThreading (SMT) achieves better system resource utilization and higher performance because it exploits Thread-Level Parallelism (TLP) in addition to “conventional” Instruction-Level Parallelism (ILP). Theoretically, system resources in every pipeline stage of an SMT microarchitecture can be dynamically shared. However, in commercial applications, all the major queues are statically partitioned. From an implementation point of view, static partitioning of resources is easier to build and incurs lower hardware overhead and power consumption. In this paper, we strive to quantitatively determine the tradeoff between static partitioning and dynamic sharing. We find that static partitioning of either the instruction fetch queue (IFQ) or the reorder buffer (ROB) is not sufficient if implemented alone (3% and 9% performance decrease, respectively, in the worst case compared with dynamic sharing), while statically partitioning both the IFQ and the ROB achieves an average performance gain of at least 9%, reaching 148% when running floating-point benchmarks, compared with dynamic sharing. We varied the number of functional units in our efforts to isolate the reason for this performance improvement. We found that statically partitioning both queues outperformed all the other partitioning mechanisms under the same system configuration. This demonstrates that the performance gain has been achieved by moving from dynamic sharing to static partitioning of the system resources.
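The difference between the two policies can be illustrated with a minimal sketch. This is not the simulator or the allocation logic used in the paper; the function name `can_allocate`, the equal per-thread split, and the 32-entry queue size are assumptions chosen only for illustration.

```python
def can_allocate(occupancy, thread_id, capacity, policy):
    """Return True if `thread_id` may take one more entry in a queue.

    occupancy: entries currently held by each thread (hypothetical model).
    capacity:  total number of entries in the queue (e.g. an IFQ or ROB).
    policy:    "static" (equal fixed slices) or "dynamic" (fully shared).
    """
    if policy == "static":
        # Assumption: each thread owns a fixed, equal slice of the queue.
        per_thread = capacity // len(occupancy)
        return occupancy[thread_id] < per_thread
    if policy == "dynamic":
        # Any thread may take any free entry, whoever holds the rest.
        return sum(occupancy) < capacity
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical 32-entry queue shared by two threads; thread 0 holds 16 entries.
occupancy = [16, 4]
print(can_allocate(occupancy, 0, 32, "static"))   # False: thread 0's slice is full
print(can_allocate(occupancy, 0, 32, "dynamic"))  # True: 12 entries are still free
```

Under the static policy in this sketch, a thread that has filled its own slice cannot take entries away from the other thread, whereas under the dynamic policy one thread can occupy most of the queue.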
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, C., Gaudiot, JL. (2005). Static Partitioning vs Dynamic Sharing of Resources in Simultaneous MultiThreading Microarchitectures. In: Cao, J., Nejdl, W., Xu, M. (eds) Advanced Parallel Processing Technologies. APPT 2005. Lecture Notes in Computer Science, vol 3756. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573937_11
DOI: https://doi.org/10.1007/11573937_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29639-3
Online ISBN: 978-3-540-32107-1
eBook Packages: Computer Science, Computer Science (R0)