
Simultaneous multithreading: maximizing on-chip parallelism

Authors:
Dean M. Tullsen, Susan J. Eggers, Henry M. Levy

Department of Computer Science and Engineering, University of Washington, Seattle, WA

ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), August 1998, Pages 533–544, https://doi.org/10.1145/285930.286011

Published: 1 August 1998



Also appears in: ISCA '95: Proceedings of the 22nd annual international symposium on Computer architecture

Abstract (excerpt): This paper examines simultaneous multithreading, a technique permitting several independent threads to issue instructions to a superscalar's multiple functional units in a single cycle. We present several models of simultaneous multithreading and ...


Published in
ISCA '98: 25 years of the international symposia on Computer architecture (selected papers)
August 1998
546 pages
ISBN: 1581130589
DOI: 10.1145/285930
Editor: Gurindar S. Sohi, Univ. of Wisconsin-Madison, Madison
Copyright © 1998 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
