- 1.A. Agarwal. Performance tradeoffs in multithreaded processots. IEEE Transactions on Parallel and Distributed Systems, 3(5):525-539, September 1992. Google ScholarDigital Library
- 2.A. Agarwal, B.H. Lira, D. Kranz, and J. Kubiatowicz. APRIL: a processor architecture for multiprocessing. In 17th Annual International Symposium on Computer Architecture. pages 104-114, May 1990. Google ScholarDigital Library
- 3.R. Alverson, D. Callahan, D. Cummings, B. Koblenz, A. Porterfield, and B. Smith. The Tera computer system. In International Conference on Supercomputing, pages 1--6, June 1990. Google ScholarDigital Library
- 4.R. Bedichek. Some efficient architecture simulation techniques. In Winter 1990 Usenix Conference, pages 53--63, January 1990.Google Scholar
- 5.M. Butler, T.Y. Yeh, Y. Patt, M. Alsup, H. Scales, and M. Shebanow. Single instruction steam parallelism is greater than two. in 18th Annual International Symposium on Computer Architecture, pages 276-286, May 199 I. Google ScholarDigital Library
- 6.G.E. Daddis, Jr. and H.C. Tomg. The concurrent execution of multiple instruction streams on superscalar processors. In International Conference on Parallel Processing, pages 1:76- 83, August 199 !.Google Scholar
- 7.W.J. Daily, S.W. Keckler, N. Carter. A. Chang, M. Fillo, and W.S. Lee. M-Machine architecture el.0. Technical Report- MIT Concurrent VLSI Architecture Memo 58, Massachusetts Institute of Technology, March 1994.Google Scholar
- 8.H. Davis, S.R. Goldschmidt, and J. Hennessy. Multiprocessor simulation and tracing using Tango. In International Conference on Parallel Processing, pages II:99- i 07, August 1991.Google Scholar
- 9.M. Denman. PowerPC 604. In Hot Chips VI, pages 193-200, August 1994,Google Scholar
- 10.K.M. Dixit. New CPU benchmark suites from SPEC. In COMPCON, Spring 1992, pages 305-310, 1992. Google ScholarDigital Library
- 11.J. Edmondson and P Rubinfietd. An overview of the 21164 AXP microprocessor. In Hot Chips VI, pages 1-8, August 1994.Google Scholar
- 12.M. Franklin. The Multiscalar Architecture. PhD thesis, University of Wisconsin, Madison, 1993. Google ScholarDigital Library
- 13.M. Franklin and G.S. Sohi. The expandable split window paradigm for exploiting fine-grain parallelism. In 19th Annual International Symposium on Computer Architecture, pages 58,--67, May 1992. Google ScholarDigital Library
- 14.A. Gupta, J. Hennessy, K. Gharachorloo, T. Mowry, and W.D. Weber. Comparative evaluation of latency reducing and tolerating techniques. In 18th Annual International Symposium on Computer Architecture, pages 254--263, May 199 !. Google ScholarDigital Library
- 15.R.H. Halstead and T. Fujita. MASA: A multithreaded processor architecture for parallel symbolic computing. In 15th Annual International Symposium on Computer Architecture, pages 4,43-451, May 1988. Google ScholarDigital Library
- 16.H. Hiram, K. Kimura, S. Nagamine, Y. Mochizuki, A. Nishimura, Y. Nakase, and T Nishizawa. An elementary processor architecture with simultaneous instruction issuing from multiple threads. In 19th Annual International Symposlum on Computer Architecture, pages 136-145. May 1992. Google ScholarDigital Library
- 17.S.W. Keckler and W.j. Dally. Processor coupling: Integrating compile time and runtime scheduling for parallelism. In 19th Annual International Symposium on Computer Architecture, pages 202-213, May 1992. Google ScholarDigital Library
- 18.M.S. Lam and R.P. Wilson. Limits of control flow on parallelism. In 19th Annual International Symposium on Computer Architecture, pages 46-57. May 1992. Google ScholarDigital Library
- 19.J. Laudon, A. Gupta, and M. Horowitz. Interleaving: A multithreading technique targeung multiprocessors and workstations. In Sixth International Conference on Architectural Support/'or Programmtng Languages and Operating Systems, pages 308-318, October 1994. Google ScholarDigital Library
- 20.P.G. Lowney, S.M. Freudenberger, T.J, Karzes, W.D. Lichtenstein, R.P. Nix, J.S. ODonnell, and J.C. Ruttenberg. The muitiflow trace scheduling compiler. Journal of Supercomputing, 7(I-2):51-142, May 1993. Google ScholarDigital Library
- 21.D.C. McCrackin. The synergistic effect of thread scheduling and caching in multithreaded computers. In COMPCON, Spring 1993, pages 157-164, 1993.Google ScholarCross Ref
- 22.R.S. Nikhil and Arvind. Can dataflow subsume von Neumann computing? In 16th Annual International Symposium on Computer Architecture, pages 262-272, June 1989. Google ScholarDigital Library
- 23.R.G. Prasadh and C.-L. Wu. A benchmark evaluation of a multi-threaded RISC processor architecture. In International Conference on Parallel Processing, pages 1:84--91, August 1991.Google Scholar
- 24.Microprocessor Report, October 24 1994.Google Scholar
- 25.Microprocessor Report, October 3 1994.Google Scholar
- 26.Microprocessor Report, November 14 1994.Google Scholar
- 27.R.H. Saavedra-Barrera, D.E. Culler, and T. von Eicken. Analysis of multithreaded architectures for parallel computing. In Second Annual ACM Symposium on Parallel Algorithms and Architectures, pages 169-I 78, July 1990. Google ScholarDigital Library
- 28.B.J. Smith. Architecture and applications of the HEP multiprocessor computer system. In SPIE Real 7qme Signal Processing IV, pages 241--248, 1981.Google Scholar
- 29.J. Smith. A study of branch prediction strategies. In 8th Annual International Symposium on Computer Architecture, pages 135-148,May 1981. Google ScholarDigital Library
- 30.G.S. Sohi and M. Franklin. High-bandwidth data memory systems for superscalar processors. In Fourth International Conference on Architectural Support for Programming 1.anguages and Operating Systems, pages 53-62, April 1991. Google ScholarDigital Library
- 31.R. Thekkath and S.J. Eggers. The effectiveness of multiple hardware contexts. In Sixth International Conference on Ar. chitectural Support for Programming Languages and Operat. ing Systems, pages 328-337, October 1994. Google ScholarDigital Library
- 32.D.W. Wall. Limits of instruction-level parallelism, in Fourth International Conference on Architectural Support for Pro. gramming Languagesand Operating Systems, pages 176-188, April 1991. Google ScholarDigital Library
- 33.W.D. Weber and A. Gupta. Exploring the benefits of multiple hardware contexts in a multiprocessor architecture: preliminaty results, in 16th Annual International Symposium on Computer Architecture, pages 273-280, June 1989. Google ScholarDigital Library
- 34.W. Yamamoto, M.J. Serrano, A.R. Talcott, R.C. Wood, and M. Nemirosky. Performance estimation of multistreamed, superscatar processors. In Twenty-Seventh Hawaii Internation Conferenceon System Sciences, pages 1:I 95-204, January 1994.Google Scholar
Index Terms
- Simultaneous multithreading: maximizing on-chip parallelism
Recommendations
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95: Proceedings of the 22nd annual international symposium on Computer architectureThis paper examines simultaneous multithreading, a technique permitting several independent threads to issue instructions to a superscalar's multiple functional units in a single cycle. We present several models of simultaneous multithreading and ...
Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading
To achieve high performance, contemporary computer systems rely on two forms of parallelism: instruction-level parallelism (ILP) and thread-level parallelism (TLP). Wide-issue super-scalar processors exploit ILP by executing multiple instructions from a ...
Comments