A comparison of the effect of branch prediction on multithreaded and scalar architectures

Authors:

Prasad N. Golla,

Eric C. Lin

ACM SIGARCH Computer Architecture News, Volume 26, Issue 4

Pages 3 - 11

https://doi.org/10.1145/1216475.1216476

Published: 01 September 1998

Abstract

Speculative instruction execution relies on dynamic branch predictors to increase processor performance by executing from predicted branch targets. Conventional Scalar architectures, such as the Superscalar or Multiscalar architecture, execute from a single instruction stream, while a Multithreaded architecture executes from multiple streams at a time. Several aggressive branch predictors with high prediction accuracies have been proposed. Unfortunately, none of them can provide 100% accuracy, so there is an inherent limit on speculative execution in real implementations. In this paper, we show that a Multithreaded architecture is a better candidate for exploiting speculative execution than Scalar architectures. Generally, the performance degradation caused by branch mispredictions is compounded for larger window sizes on Scalar architectures, whereas a Multithreaded architecture can sustain higher performance for a large aggregate speculative window by increasing the number of executing threads. Hence, heavier workloads may increase performance and utilization on Multithreaded architectures. We present analytical and simulation results to support our argument.
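
The compounding effect described in the abstract can be illustrated with a small back-of-the-envelope sketch. This is not the paper's analytical model, which is not reproduced on this page. It assumes that every branch prediction is independent and correct with the same probability, and that a speculated segment is useful only if every prediction leading to it was correct. The function names and the choice of 16 branches and 4 threads are hypothetical, picked only for illustration.

```python
from typing import Tuple


def useful_fraction(accuracy: float, branches_in_window: int) -> float:
    """Expected fraction of speculated segments that lie on the correct path.

    Assumption (not from the paper): predictions are independent, each correct
    with probability `accuracy`. Segment i (1-based) is useful only if all i
    predictions before it were correct, which happens with probability
    accuracy ** i.
    """
    if branches_in_window <= 0:
        return 1.0  # nothing is speculated, so nothing can be wasted
    total = sum(accuracy ** i for i in range(1, branches_in_window + 1))
    return total / branches_in_window


def compare(accuracy: float, total_branches: int, threads: int) -> Tuple[float, float]:
    """Return (single-stream, multithreaded) useful fractions.

    Both cases speculate past the same aggregate number of branches; the
    multithreaded case splits them evenly across `threads` independent streams.
    """
    per_thread = total_branches // threads
    single = useful_fraction(accuracy, total_branches)
    multi = useful_fraction(accuracy, per_thread)
    return single, multi


if __name__ == "__main__":
    single, multi = compare(accuracy=0.9, total_branches=16, threads=4)
    print(f"single stream: {single:.3f}, four threads: {multi:.3f}")
```

Under these assumptions, splitting the same aggregate window across several threads keeps each thread's speculation depth shallow, so a larger share of the speculated work stays on the correct path. The sketch ignores shared-resource contention and misprediction recovery cost, both of which a real comparison would have to account for.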

Cited By

Swanson S, McDowell L, Swift M, Eggers S, Levy H (2003). An evaluation of speculative instruction execution on simultaneous multithreaded processors. ACM Transactions on Computer Systems 21:3 (314-340). Online publication date: 1-Aug-2003.
https://dl.acm.org/doi/10.1145/859716.859720
Kisuki T, Corporaal H, Knijnenburg P (2002). Branch history register cache. Journal of Scheduling 5:5 (413-424). Online publication date: 2002.
https://doi.org/10.1002/jos.113

Published In

ACM SIGARCH Computer Architecture News Volume 26, Issue 4

September 1998

14 pages

ISSN:0163-5964

DOI:10.1145/1216475

Copyright © 1998 Authors.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Article

Article Metrics

Total Citations: 2
Total Downloads: 279
Downloads (Last 12 months): 72
Downloads (Last 6 weeks): 3

Reflects downloads up to 05 Mar 2025
