HeDGE: Hybrid Dataflow Graph Execution in the Issue Logic

Subramanian, Suriya; McKinley, Kathryn S.

doi:10.1007/978-3-540-92990-1_23

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5409)

Included in the following conference series:

International Conference on High-Performance Embedded Architectures and Compilers

Abstract

Exposing more instruction-level parallelism in out-of-order superscalar processors requires increasing the number of dynamic in-flight instructions. However, large instruction windows increase power consumption and latency in the issue logic. We propose a design called Hybrid Dataflow Graph Execution (HeDGE) for conventional Instruction Set Architectures (ISAs). HeDGE explicitly maintains dependences between instructions in the issue window by modifying the issue, register renaming, and wakeup logic. The HeDGE wakeup logic notifies only consumer instructions when data values arrive. Explicit consumer encoding naturally leads to the use of Random Access Memory (RAM) instead of the Content Addressable Memory (CAM) needed for broadcast. HeDGE is distinguished from prior approaches in part because it dynamically inserts forwarding instructions. Although these additional instructions degrade performance by an average of 3 to 17% for SPEC C and Fortran benchmarks and 1.5 to 8% for DaCapo Java benchmarks, they enable energy-efficient execution in large instruction windows. The HeDGE RAM-based instruction window consumes on average 98% less energy than a conventional CAM as modeled in CACTI for 70 nm technology. In conventional designs, this structure contributes 7 to 20% to total energy consumption. HeDGE allows us to achieve power and energy gains by using RAMs in the issue logic while maintaining a conventional instruction set.
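
The wakeup scheme the abstract describes (each producer records its consumers explicitly, and forwarding instructions are inserted when a value has more consumers than a producer can name) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' design: the per-entry limit of two consumer slots (`MAX_TARGETS`), the `Window` and `Entry` classes, the chaining policy for forwarders, and the choice to let a forwarder fire immediately are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field

MAX_TARGETS = 2  # assumed number of consumer slots per entry; the abstract gives no figure


@dataclass
class Entry:
    name: str
    pending: int                 # source operands still outstanding
    is_forward: bool = False     # True for a dynamically inserted forwarding instruction
    targets: list = field(default_factory=list)  # window slots of the consumers


class Window:
    """Issue window addressed by slot index, as a RAM would be (no tag matching)."""

    def __init__(self):
        self.entries = []

    def add(self, name, producers=()):
        """Insert an instruction whose operands come from the given producer slots."""
        idx = len(self.entries)
        self.entries.append(Entry(name, pending=len(producers)))
        for p in producers:
            self._link(p, idx)
        return idx

    def _link(self, producer, consumer):
        """Record consumer in the producer's target list, forwarding when it is full."""
        entry = self.entries[producer]
        if len(entry.targets) < MAX_TARGETS:
            entry.targets.append(consumer)
            return
        last = entry.targets[-1]
        if self.entries[last].is_forward:
            # A forwarder already hangs off this producer: chain through it.
            self._link(last, consumer)
            return
        # Hypothetical policy: a new forwarder takes over the last slot and
        # fans the value out to the displaced consumer and the new one.
        fwd = len(self.entries)
        self.entries.append(
            Entry(f"fwd({entry.name})", pending=1, is_forward=True,
                  targets=[last, consumer]))
        entry.targets[-1] = fwd

    def complete(self, idx):
        """Deliver the result of slot idx; return names of instructions now ready."""
        woken, work = [], [idx]
        while work:
            cur = work.pop(0)
            for t in self.entries[cur].targets:   # direct indexed writes only
                e = self.entries[t]
                e.pending -= 1
                if e.pending == 0:
                    if e.is_forward:
                        work.append(t)  # simplification: forwarder fires at once
                    else:
                        woken.append(e.name)
        return woken


w = Window()
load = w.add("load")
for op in ("add", "sub", "mul"):
    w.add(op, [load])          # the third consumer forces a forwarding entry
ready = w.complete(load)
```

In this model the `load` has three consumers but only two slots, so one forwarding entry is inserted, and `complete` touches only the slots named in each target list rather than comparing a tag against every window entry. In hardware the forwarder would occupy a window slot and take time to issue, which is the kind of overhead the abstract reports as a performance cost.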

Author information

Authors and Affiliations

Department of Computer Sciences, The University of Texas at Austin, USA
Suriya Subramanian & Kathryn S. McKinley

Editor information

Editors and Affiliations

IRISA, Campus de Beaulieu, 35042, Rennes Cedex, France
André Seznec
Intel Corporation, Massachusetts Microprocessor Design Center, 77 Reed Road, MA 01749, Hudson, USA
Joel Emer
School of Informatics, Institute for Computing Systems Architecture, King’s Buildings, EH9 3JZ, Edinburgh, United Kingdom
Michael O’Boyle
Department of Electrical Engineering, Princeton University, 34 Olden Street, NJ 08544-5263, Princeton, USA
Margaret Martonosi
Department of Computer Science, University of Augsburg, 86135, Augsburg, Germany
Theo Ungerer

About this paper

Cite this paper

Subramanian, S., McKinley, K.S. (2009). HeDGE: Hybrid Dataflow Graph Execution in the Issue Logic. In: Seznec, A., Emer, J., O’Boyle, M., Martonosi, M., Ungerer, T. (eds) High Performance Embedded Architectures and Compilers. HiPEAC 2009. Lecture Notes in Computer Science, vol 5409. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92990-1_23

DOI: https://doi.org/10.1007/978-3-540-92990-1_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92989-5
Online ISBN: 978-3-540-92990-1
eBook Packages: Computer Science (R0)
