A comparison of the effect of branch prediction on multithreaded and scalar architectures

Published: 01 September 1998

Abstract

Speculative instruction execution relies on dynamic branch predictors to increase processor performance by executing from predicted branch target routines. Conventional Scalar architectures, such as the Superscalar or Multiscalar architecture, execute from a single stream, while a Multithreaded architecture executes from multiple streams at a time. Several aggressive branch predictors with high prediction accuracies have been proposed. Unfortunately, none of them can provide 100% accuracy, so speculative execution is inherently limited in real implementations. In this paper, we show that a Multithreaded architecture is a better candidate for utilizing speculative execution than Scalar architectures. In general, the performance degradation caused by branch mispredictions is compounded for larger window sizes on Scalar architectures, whereas a Multithreaded architecture can sustain higher performance for a large aggregated speculative window by increasing the number of executing threads. Hence, heavier workloads may increase performance and utilization for Multithreaded architectures. We present analytical and simulation results to support our argument.
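The compounding effect described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's analytical model; it assumes (hypothetically) that branches are spread evenly through the instruction stream, that predictions are independent with a fixed accuracy, and that an aggregated window is split evenly across threads. The function name `useful_fraction` and its parameters are introduced here for illustration only.

```python
def useful_fraction(accuracy, branch_fraction, window, threads=1):
    """Toy estimate of the fraction of a speculative window on the correct path.

    Assumptions (not taken from the paper): branches occur uniformly at rate
    `branch_fraction`, each prediction is independently correct with
    probability `accuracy`, and the aggregated `window` is divided evenly
    among `threads` independent instruction streams.
    """
    if window % threads != 0:
        raise ValueError("window must divide evenly among threads")
    per_thread = window // threads
    # Probability that one more instruction is still on the correct path:
    # accuracy raised to the expected number of branches per instruction.
    survive = accuracy ** branch_fraction
    # Instruction i is useful only if every earlier branch in the same
    # thread was predicted correctly, so its weight is survive ** i.
    useful = sum(survive ** i for i in range(per_thread))
    return useful / per_thread


if __name__ == "__main__":
    # Same aggregated window of 64 instructions, split over 1, 2, 4, 8 threads.
    for t in (1, 2, 4, 8):
        print(t, round(useful_fraction(0.9, 0.2, 64, threads=t), 3))
```

Under these assumptions, each thread speculates less deeply, so fewer instructions depend on a long chain of correct predictions, and the useful share of the same aggregated window rises with the thread count.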

Cited By

  • (2003) An evaluation of speculative instruction execution on simultaneous multithreaded processors. ACM Transactions on Computer Systems 21:3 (314-340). DOI: 10.1145/859716.859720. Online publication date: 1-Aug-2003
  • (2002) Branch history register cache. Journal of Scheduling 5:5 (413-424). DOI: 10.1002/jos.113. Online publication date: 2002

Published In

ACM SIGARCH Computer Architecture News, Volume 26, Issue 4
September 1998
14 pages
ISSN: 0163-5964
DOI: 10.1145/1216475

Publisher

Association for Computing Machinery, New York, NY, United States
