Abstract
Power consumption has emerged as a key design concern across the entire computing range, from low-end embedded systems to high-end supercomputers. Understanding the power characteristics of a microprocessor under design requires a careful study using a variety of workloads. These workloads range from benchmarks that represent typical behavior to hand-tuned stress benchmarks (so-called stressmarks) that push the microprocessor to its extreme power consumption.
This paper closes the gap between these two extremes by studying techniques for the automated identification of stress patterns (worst-case or extreme application behaviors) in typical workloads. To do so, we borrow from sampled simulation theory and provide two key insights. First, although representative sampling is slightly less effective than statistical sampling in characterizing average behavior, it is substantially more effective in finding stress patterns. Second, threshold clustering is a better alternative for finding stress patterns than k-means clustering, which is typically used in representative sampling. We identify a wide range of extreme behaviors, including stress patterns for maximum energy, maximum power, maximum CPI, maximum branch misprediction rate, and maximum cache miss rate. Overall, we can identify extreme behaviors in microprocessor workloads with a speedup of three orders of magnitude and an error of a few percent on average.
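As an illustration of the second insight, the sketch below shows one common form of threshold clustering (the "leader" algorithm) applied to per-interval signature vectors, followed by a search for the maximum of a metric over the cluster representatives only. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define the clustering procedure, so the leader-style assignment rule, the function names `threshold_clustering` and `estimate_max`, the two-dimensional signatures, and the power values are all hypothetical. The point it tries to convey is that a fixed distance threshold bounds how different the members of a cluster can be, so an unusual interval tends to end up as its own representative rather than being absorbed into a large cluster.

```python
import math

def threshold_clustering(vectors, threshold):
    """Leader-style threshold clustering (an assumed variant).

    Each vector joins the first existing cluster whose leader lies within
    `threshold` (Euclidean distance); otherwise it starts a new cluster
    and becomes that cluster's leader.
    """
    leaders = []     # index of the first member (leader) of each cluster
    assignment = []  # cluster id for every vector, in input order
    for i, v in enumerate(vectors):
        for cid, lead in enumerate(leaders):
            if math.dist(v, vectors[lead]) <= threshold:
                assignment.append(cid)
                break
        else:
            # No leader is close enough: open a new cluster.
            leaders.append(i)
            assignment.append(len(leaders) - 1)
    return leaders, assignment

def estimate_max(leaders, metric):
    """Evaluate `metric` only on the cluster leaders and return the
    (interval index, value) pair with the largest value."""
    best = max(leaders, key=metric)
    return best, metric(best)

# Hypothetical data: one signature vector and one power value per interval.
signatures = [(0.0, 0.0), (0.1, 0.0), (0.05, 0.05), (1.0, 1.0), (0.0, 0.1)]
power = [10.0, 11.0, 10.5, 25.0, 10.2]

leaders, assignment = threshold_clustering(signatures, threshold=0.3)
stress_interval, stress_power = estimate_max(leaders, lambda i: power[i])
print(leaders, assignment, stress_interval, stress_power)
```

In a real flow, `metric` would stand for a detailed simulation of the chosen interval, which is where the speedup would come from: only one interval per cluster is simulated in detail.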
Lieven Eeckhout is a postdoctoral fellow with the Fund for Scientific Research in Flanders (Belgium) (FWO-Vlaanderen). Additional support is provided by the FWO projects G.0232.06 and G.0255.08, and the UGent-BOF project 01J14407.
© 2011 Springer-Verlag Berlin Heidelberg
Cite this chapter
Vandeputte, F., Eeckhout, L. (2011). Finding Extreme Behaviors in Microprocessor Workloads. In: Stenström, P. (eds) Transactions on High-Performance Embedded Architectures and Compilers IV. Lecture Notes in Computer Science, vol 6760. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24568-8_8
DOI: https://doi.org/10.1007/978-3-642-24568-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24567-1
Online ISBN: 978-3-642-24568-8
eBook Packages: Computer Science (R0)