Instruction-level experimental evaluation of the Multiflow TRACE 14/300 VLIW computer

The Journal of Supercomputing

Abstract

Advances in compiler technology have recently led to the introduction of the architectural paradigm known as the very long instruction word (VLIW) architecture. The Multiflow TRACE series of processors is the first commercial line of processors with this architecture. This article presents experimental results concerning the performance and resource utilization of the TRACE 14/300 on a set of 11 common scientific programs written in both C and FORTRAN. Several characteristics of the application, architecture, implementation, and compiler that contribute to the observed results are identified. These characteristics include a conservative approach by the compiler in determining the existence of data dependence and disambiguating memory references, memory latency and resource dependences resulting from the TRACE 14/300 implementation, and actual data dependences that exist within the code. Alleviating the effects of the first three of these bottlenecks is found to improve the TRACE 14/300 performance by a factor of 1.55 on average. Performance of the TRACE 14/300 is also measured on several standard benchmarks, including the SPEC89 benchmark suite. Performance on the SPEC89 benchmarks is found to be comparable to that of the superscalar IBM RS/6000 when differences in implementation technology are considered. Concluding remarks concerning instruction-level parallel processing are also presented.

Cite this article

Schuette, M.A., Shen, J.P. Instruction-level experimental evaluation of the Multiflow TRACE 14/300 VLIW computer. J Supercomput 7, 249–271 (1993). https://doi.org/10.1007/BF01205186

  • DOI: https://doi.org/10.1007/BF01205186
