Abstract
The role of the instruction scheduler is to supply instructions to functional units in a timely manner while avoiding data and structural hazards. Current schedulers broadcast result register numbers to all instructions waiting in the issue queue and rely on a global arbiter to select ready instructions from that queue. This approach, called broadcast scheduling, does not scale well because of its complexity. To reduce the complexity of broadcast schedulers, data-flow pre-scheduling has been proposed. The basic idea is to predict the issue time of each instruction from the availability of its operands and then count that delay down until the instruction is ready to issue. However, resource conflicts for issue slots and functional units delay the issue of the conflicting instructions and cause a large number of replays. We propose adding instruction pre-selection to data-flow pre-schedulers to make instruction pre-scheduling accurate. Our pre-scheduler keeps track of the allocation status of resources so that resource conflicts are eliminated. Pre-scheduled instructions are stored in an issue buffer until their issue delay elapses and then issue automatically. Our analysis shows that pre-schedulers with pre-selection improve performance by 60% over current broadcast schedulers in pipeline designs where the scheduler is the bottleneck. We expect this result to hold in future technologies, as logic-intensive designs with short wires will be preferable to designs with long wire delays.
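The idea of combining data-flow pre-scheduling with pre-selection can be illustrated with a small software model. The sketch below is not the hardware design described in the chapter; it is a minimal illustration under stated assumptions: functional units are fully pipelined (a unit is reserved only for the issue cycle), latencies are fixed and known, and the names `PreScheduler`, `schedule`, `issue_width`, `fu_count`, and `latency` are hypothetical. For each instruction it predicts the earliest cycle at which the source operands are available, then moves the issue cycle forward until an issue slot and a functional unit of the required class are both unreserved.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PreScheduler:
    """Toy model of pre-scheduling with pre-selection (illustrative only)."""
    issue_width: int   # issue slots available per cycle (assumed parameter)
    fu_count: dict     # functional units per class, e.g. {"alu": 1}
    latency: dict      # fixed result latency per class, in cycles
    # register name -> cycle at which its value becomes available
    reg_ready: dict = field(default_factory=dict)
    # cycle -> number of issue slots already reserved
    slots_used: dict = field(default_factory=lambda: defaultdict(int))
    # (cycle, class) -> number of functional units already reserved
    fus_used: dict = field(default_factory=lambda: defaultdict(int))

    def schedule(self, now, fu, dest, srcs):
        """Return the cycle in which the instruction is predicted to issue."""
        # Data-flow prediction: wait until every source operand is available.
        cycle = max([now + 1] + [self.reg_ready.get(r, 0) for r in srcs])
        # Pre-selection: skip cycles whose slots or units are fully reserved.
        while (self.slots_used[cycle] >= self.issue_width
               or self.fus_used[(cycle, fu)] >= self.fu_count[fu]):
            cycle += 1
        # Reserve the resources so later instructions cannot conflict.
        self.slots_used[cycle] += 1
        self.fus_used[(cycle, fu)] += 1
        if dest is not None:
            self.reg_ready[dest] = cycle + self.latency[fu]
        return cycle


ps = PreScheduler(issue_width=2,
                  fu_count={"alu": 1, "mul": 1},
                  latency={"alu": 1, "mul": 3})
a = ps.schedule(0, "alu", "r1", [])      # issues in cycle 1
b = ps.schedule(0, "alu", "r2", [])      # the only ALU is taken in cycle 1
c = ps.schedule(0, "mul", "r3", ["r1"])  # waits for r1, ready in cycle 2
d = ps.schedule(0, "alu", "r4", ["r3"])  # waits for the 3-cycle multiply
print(a, b, c, d)  # prints "1 2 2 5"
```

The difference `cycle - now` plays the role of the issue delay that an instruction would wait out in the issue buffer. Because the second ALU instruction is moved to cycle 2 when it is scheduled, it never competes for the single ALU at issue time, which is the kind of conflict that would otherwise lead to a replay.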
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Choi, W., Park, SJ., Dubois, M. (2009). Accurate Instruction Pre-scheduling in Dynamically Scheduled Processors. In: Stenström, P. (ed.) Transactions on High-Performance Embedded Architectures and Compilers II. Lecture Notes in Computer Science, vol 5470. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00904-4_7
DOI: https://doi.org/10.1007/978-3-642-00904-4_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00903-7
Online ISBN: 978-3-642-00904-4
eBook Packages: Computer Science (R0)