Trace-Driven Memory Simulation: A Survey

Performance Evaluation: Origins and Directions

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1769)

Abstract

The widening gap between processor and main-memory speeds is widely recognized as one of the primary bottlenecks to overall computer-system performance. The traditional solution is to build small, fast memories (caches) that hold recently used data and instructions close to the processor for quicker access [64]. During the past decade, microprocessor clock rates have increased by 40% per year, while main-memory (DRAM) speeds have increased by only about 11% per year [76]. This trend has made modern computer systems increasingly dependent on caches. A case in point: disabling the cache of the VAX 11/780, a machine introduced in the late 1970s, would have increased its workload run times by a factor of only 1.6 [32], while disabling the cache of the HP 9000/735, a machine introduced in the early 1990s, would slow its workloads by a factor of 15 [76].

References

  1. Agarwal, A.: Analysis of cache performance for operating systems and multiprogramming. Ph.D. dissertation, Stanford. 1989

  2. Agarwal, A., Horowitz, M., and Hennessy, J.: An analytical cache model. ACM Transactions on Computer Systems 7(2): 184–215, 1989

  3. Agarwal, A. and Huffman, M.: Blocking: Exploiting spatial locality for trace compaction. In Proc. of the 1990 SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, Boulder, CO, ACM, 48–57, 1990

  4. Baker, M.: Cluster Computing Review. Northeast Parallel Architectures Center (NPAC) Technical Report SCCS-748, November, 1995

  5. Becker, J. and Park, A.: An analysis of the information content of address and data reference streams. In Proc. of the 1993 SIGMETRICS Conf. on the Measurement and Modeling of Computer Systems, Santa Clara, CA, 262–263, 1993

  6. Bedichek, R.: The Meerkat multicomputer: Tradeoffs in multicomputer architecture. Ph.D. dissertation, University of Washington Department of Computer Science Technical Report 94-06-06, August 1994

  7. Bedichek, R.: Talisman: fast and accurate multicomputer simulation. In Proc. of the 1995 SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, 14–24, 1995

  8. Borg, A., Kessler, R., Lazana, G., and Wall, D.: Long address traces from RISC machines: generation and analysis. DEC Western Research Lab Technical Report 89/14, 1989

  9. Borg, A., Kessler, R., and Wall, D.: Generation and analysis of very long address traces. In Proc. of the 17th Ann. Int. Symp. on Computer Architecture, IEEE, 1990

  10. Chen, B.: Software methods for system address tracing. In Proc. of the Fourth Workshop on Workstation Operating Systems, Napa, California, 1993

  11. Chen, B. and Bershad, B.: The impact of operating system structure on memory system performance. In Proc. of the 14th Symp. on Operating System Principles, 1993

  12. Chen, B.: Memory behavior of an X11 window system. In Proc. of the USENIX Winter 1994 Technical Conf., 1994

  13. Clark, D. W., Bannon, P. J., and Keller, J. B.: Measuring VAX 8800 performance with a histogram hardware monitor. In Proc. of the 15th Ann. Int. Symp. on Computer Architecture, Honolulu, Hawaii, IEEE, 176–185, 1988

  14. Cmelik, R. and Keppel, D.: Shade: A fast instruction-set simulator for execution profiling. University of Washington Technical Report UWCSE 93-06-06. 1993

  15. Cmelik, B. and Keppel, D.: Shade: A fast instruction-set simulator for execution profiling. In Proc. of the 1994 SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, Nashville, TN, ACM, 128–137, 1994

  16. Cvetanovic, Z. and Bhandarkar, D.: Characterization of Alpha AXP performance using TP and SPEC Workloads. In Proc. of the 21st Ann. Int. Symp. on Computer Architecture, Chicago, IL, IEEE, 1994

  17. Davies, P., Lacroute, P., Heinlein, J., and Horowitz, M.: Mable: A technique for efficient machine simulation. Stanford University Technical Report CSL-TR-94-636, October, 1994

  18. Davis, H., Goldschmidt, S., and Hennessy, J.: Multiprocessor simulation and tracing using Tango. In Proc. of the 1991 Int. Conf. on Parallel Processing, 99–107, 1991

  19. Digital: Alpha Architecture Handbook. USA, Digital Equipment Corporation, 1992

  20. Eggers, S., Keppel, D., Koldinger, E., and Levy, H.: Techniques for efficient inline tracing on a shared-memory multiprocessor. In Proc. of the 1990 SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, Boulder, CO, 37–47, 1990

  21. Emer, J. and Clark, D.: A characterization of processor performance in the VAX-11/780. In Proc. of the 11th Ann. Symp. on Computer Architecture, Ann Arbor, MI, 301–309, 1984

  22. Eustace, A. and Srivastava, A.: ATOM: a flexible interface for building high performance program analysis tools. In Proc. of the USENIX Winter 1995 Technical Conf. on UNIX and Advanced Computing Systems, New Orleans, Louisiana, 303–314, January, 1995

  23. Flanagan, J. K., Nelson, B. E., Archibald, J. K., and Grimsrud, K.: BACH: BYU address collection hardware, the collection of complete traces. In Proc. of the 6th Int. Conf. on Modelling Techniques and Tools for Computer Performance Evaluation, 128–137, 1992

  24. Gee, J., Hill, M., Pnevmatikatos, D., and Smith, A. J.: Cache performance of the SPEC92 benchmark suite. IEEE Micro (August): 17–27, 1993

  25. Goldschmidt, S. and Hennessy, J.: The accuracy of trace-driven simulation of multiprocessors. Stanford University Technical Report CSL-TR-92-546, September 1992

  26. Goldschmidt, S. and Hennessy, J.: The accuracy of trace-driven simulation of multiprocessors. In Proc. of the 1993 ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, 146–157, May 1993

  27. Hammerstrom, D. and Davidson, E.: Information content of CPU memory referencing behavior. In Proc. of the 4th Int. Symp. on Computer Architecture, 184–192, 1977

  28. Hill, M.: Aspects of cache memory and instruction buffer performance. Ph.D. dissertation, The University of California at Berkeley. 1987

  29. Hill, M. and Smith, A.: Evaluating associativity in CPU caches. IEEE Transactions on Computers 38(12): 1612–1630, 1989

  30. Holliday, M.: Techniques for cache and memory simulation using address reference traces. Int. Journal in Computer Simulation 1: 129–151, 1991

  31. IBM: IBM RISC System/6000 Technology. Austin, TX, IBM, 1990

  32. Jouppi, N.: Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. In Proc. of the 17th Ann. Int. Symp. on Computer Architecture, Seattle, WA, IEEE, 364–373, 1990

  33. Kaeli, D.: Issues in trace-driven simulation. In Proc. of the 22nd Ann. Pittsburgh Modeling and Simulation Conf., Vol. 22, Part 5, May, 2533–2540, 1991

  34. Kessler, R.: Analysis of multi-megabyte secondary CPU cache memories. Ph.D. dissertation, University of Wisconsin-Madison. 1991

  35. Laha, S., Patel, J., and Iyer, R.: Accurate low-cost methods for performance evaluation of cache memory systems. IEEE Transactions on Computers 37(11): 1325–1336, 1988

  36. Larus, J. R.: Abstract execution: A technique for efficiently tracing programs. Software Practice and Experience, 20(12):1241–1258, December, 1990

  37. Larus, J.: SPIM S20: A MIPS R2000 Simulator. University of Wisconsin-Madison Technical Report, Revision 9. 1991

  38. Larus, J. R.: Efficient program tracing. IEEE Computer, May: 52–60, 1993

  39. Larus, J. R. and Schnorr, E.: EEL: Machine independent executable editing. In Proc. SIGPLAN Conf. on Programming Language Design and Implementation, June, 1995

  40. Lebeck, A. and Wood, D.: Fast-Cache: A new abstraction for memory-system simulation. University of Wisconsin-Madison Technical Report 1211, 1994

  41. Lebeck, A. and Wood, D.: Active Memory: A new abstraction for memory-system simulation. In Proc. of the 1995 SIGMETRICS Conf. on the Measurement and Modeling of Computer Systems, May, 220–230, 1995

  42. Lee, C.-C.: A case study of a hardware-managed TLB in a multi-tasking environment. University of Michigan Technical Report. 1994

  43. Magnusson, P.: A design for efficient simulation of a multiprocessor. In Proc. of the 1993 Western Simulation Multiconference on Int. Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 69–78, La Jolla, California, 1993

  44. Martonosi, M., Gupta, A., and Anderson, T.: MemSpy: Analyzing memory system bottlenecks in programs. In Proc. of the 1992 SIGMETRICS Conf. on the Measurement and Modeling of Computer Systems, ACM, 1992

  45. Martonosi, M., Gupta, A., and Anderson, T.: Effectiveness of trace sampling for performance debugging tools. In Proc. of the 1993 SIGMETRICS Conf. on the Measurement and Modeling of Computer Systems, Santa Clara, California, ACM, 248–259, 1993

  46. Mattson, R. L., Gecsei, J., Slutz, D. R., and Traiger, I. L.: Evaluation techniques for storage hierarchies. IBM Systems Journal 9(2): 78–117, 1970

  47. Maynard, A. M., Donnelly, C., and Olszewski, B.: Contrasting characteristics and cache performance of technical and multi-user commercial workloads. In Proc. of the Sixth Int. Conf. on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, ACM, 145–156, 1994

  48. MIPS: RISCompiler Languages Programmer’s Guide. MIPS, 1988

  49. Mogul, J. C. and Borg, A.: The effect of context switches on cache performance. In Proc. of the 4th Int. Conf. on Architectural Support for Programming Languages and Operating Systems, Santa Clara, California, ACM, 75–84, 1991

  50. Nagle, D., Uhlig, R., and Mudge, T.: Monster: A tool for analyzing the interaction between operating systems and computer architectures. University of Michigan Technical Report CSE-TR-147-92. 1992

  51. Nagle, D., Uhlig, R., Stanley, T., Sechrest, S., Mudge, T., and Brown, R.: Design tradeoffs for software-managed TLBs. In Proc. of the 20th Ann. Int. Symp. on Computer Architecture, San Diego, California, IEEE, 27–38, 1993

  52. Nagle, D., Uhlig, R., Mudge, T., and Sechrest, S.: Optimal allocation of on-chip memory for multiple-API operating systems. In Proc. of the 21st Int. Symp. on Computer Architecture, Chicago, IL, 1994

  53. Pierce, J. and Mudge, T.: IDtrace — A tracing tool for i486 simulation. University of Michigan Technical Report CSE-TR-203-94. 1994

  54. Pierce, J., Smith, M. D., and Mudge, T.: “Instrumentation tools,” in Fast Simulation of Computer Architectures (T. M. Conte and C. E. Gimarc, eds.), Kluwer Academic Publishers: Boston, MA, 1995

  55. Pleszkun, A.: Techniques for compressing program address traces. Technical Report, Department of Electrical and Computer Engineering, University of Colorado-Boulder. 1994

  56. Puzak, T.: Analysis of cache replacement algorithms. Ph.D. dissertation, University of Massachusetts. 1985

  57. Reinhardt, S., Hill, M., Larus, J., Lebeck, A., Lewis, J., and Wood, D.: The Wisconsin Wind Tunnel: Virtual prototyping of parallel computers. In Proc. of the 1993 SIGMETRICS Int. Conf. on Measurement and Modeling of Computer Systems, Santa Clara, CA, ACM, 48–60, 1993

  58. Reinhardt, S., Pfile, R., and Wood, D.: Decoupled hardware support for distributed shared memory. To appear in Proc. of the 23rd Ann. Int. Symp. on Computer Architecture, 1996

  59. Romer, T., Lee, D., Voelker, G., Wolman, A., Wong, W., Baer, J., Bershad, B., and Levy, H.: The structure and performance of interpreters. To appear in the Proc. of the 7th Int. Conf. on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, October, 1996

  60. Rosenblum, M., Herrod, S., Witchel, E., and Gupta, A.: Complete computer simulation: the SimOS approach. In IEEE Parallel and Distributed Technology, Fall 1995

  61. Samples, A.: Mache: no-loss trace compaction. In Proc. of 1989 SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, ACM, 89–97, 1989

  62. Sites, R., Chernoff, A., Kirk, M., Marks, M., and Robinson, S.: Binary translation. Digital Technical Journal 4(4): 137–152, 1992

  63. Smith, A. J.: Two methods for the efficient analysis of memory address trace data. IEEE Transactions on Software Engineering SE-3(1): 94–101, 1977

  64. Smith, A. J.: Cache memories. Computing Surveys 14(3): 473–530, 1982

  65. Smith, M. D.: Tracing with pixie. Technical Report, Stanford University, Stanford, CA. 1991

  66. Srivastava, A. and Eustace, A.: ATOM: A system for building customized program analysis tools. In Proc. of the SIGPLAN’ 94 Conf. on Programming Language Design and Implementation, 196–205, June 1994

  67. Stephens, C., Cogswell, B., Heinlein, J., Palmer, G., and Shen, J.: Instruction level profiling and evaluation of the IBM RS/6000. In Proc. of the 18th Ann. Int. Symp. on Computer Architecture, Toronto, Canada, ACM, 180–189, 1991

  68. Stunkel, C. and Fuchs, W.: TRAPEDS: producing traces for multicomputers via execution-driven simulation. In Proc. of the 1989 SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, Berkeley, CA, ACM, 70–78, 1989

  69. Stunkel, C., Janssens, B., and Fuchs, W. K.: Collecting address traces from parallel computers. In Proc. of the 24th Ann. Hawaii Int. Conf. on System Sciences, Hawaii, 373–383, 1991

  70. Sugumar, R.: Multi-configuration simulation algorithms for the evaluation of computer designs. Ph.D. dissertation, University of Michigan. 1993

  71. Talluri, M. and Hill, M.: Surpassing the TLB performance of superpages with less operating system support. In Proc. of the 6th Int. Conf. on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, ACM, 1994

  72. Thompson, J. and Smith, A.: Efficient (stack) algorithms for analysis of write-back and sector memories. ACM Transactions on Computer Systems 7(1): 78–116, 1989

  73. Uhlig, R., Nagle, D., Mudge, T., and Sechrest, S.: Trap-driven simulation with Tapeworm II. In Proc. of the Sixth Int. Conf. on Architectural Support for Programming Languages and Operating Systems, San Jose, California, ACM Press (SIGARCH), 132–144, 1994

  74. Uhlig, R., Nagle, D., Mudge, T., Sechrest, S., and Emer, J.: Instruction fetching: coping with code bloat. To appear in Proc. of the 22nd Int. Symp. on Computer Architecture, Santa Margherita Ligure, Italy, June, 1995

  75. Uhlig, R. and Mudge, T.: Trace-driven memory simulation: A survey. Computing Surveys 29(2): 128–170, 1997

  76. Upton, M. D.: Architectural trade-offs in a latency tolerant gallium arsenide microprocessor. Ph.D. Dissertation, The University of Michigan, 1994

  77. Veenstra, J. and Fowler, R.: MINT: A front end for efficient simulation of shared-memory multiprocessors. In Proc. of the 2nd Int. Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication systems (MASCOTS), 201–207, 1994

  78. Wall, D.: Link-time code modification. DEC Western Research Lab Technical Report 89/17. 1989

  79. Wall, D.: Systems for late code modification. DEC Western Research Lab Technical Report 92/3. 1992

  80. Wang, W.-H. and Baer, J.-L.: Efficient trace-driven simulation methods for cache performance analysis. In Proc. of the 1990 SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, Boulder, CO, ACM, 27–36, 1990

  81. Witchel, E. and Rosenblum, M.: Embra: fast and flexible machine simulation. In Proc. of the 1996 SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, Philadelphia, May, 1996

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Uhlig, R.A., Mudge, T.N. (2000). Trace-Driven Memory Simulation: A Survey. In: Haring, G., Lindemann, C., Reiser, M. (eds) Performance Evaluation: Origins and Directions. Lecture Notes in Computer Science, vol 1769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46506-5_5

  • DOI: https://doi.org/10.1007/3-540-46506-5_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67193-0

  • Online ISBN: 978-3-540-46506-5

  • eBook Packages: Springer Book Archive
