Abstract
Dynamic instruction scheduling logic is one of the most critical, cycle-limiting structures in modern superscalar processors, and it cannot easily be pipelined without a significant loss of performance. However, these losses are caused by only a small fraction of instructions, namely those that cannot tolerate non-atomic scheduling. We first perform an empirical analysis of instruction streams to determine which instructions actually require single-cycle scheduling. We then propose a Non-Uniform Scheduler, a design that partitions the scheduling logic into two queues, each with dedicated wakeup and selection logic: a small Fast Issue Queue (FIQ) that issues critical instructions in back-to-back cycles, and a large Slow Issue Queue (SIQ) that issues the remaining instructions over two cycles, with a one-cycle bubble between dependent instructions. Finally, we propose and evaluate several steering mechanisms to distribute instructions effectively between the queues.
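The timing difference between the two queues can be illustrated with a toy model. The sketch below is not the authors' simulator: it assumes single-cycle execution latency, unlimited issue width, and that the one-cycle bubble is charged to any dependent instruction held in the SIQ. The function name `issue_cycles` and the instruction labels are hypothetical.

```python
from typing import Dict, List

FIQ, SIQ = "FIQ", "SIQ"

def issue_cycles(deps: Dict[str, List[str]], queue: Dict[str, str]) -> Dict[str, int]:
    """Return the earliest issue cycle of each instruction.

    deps maps an instruction to its producers and must list producers
    before consumers (program order). Assumptions of this toy model:
    1-cycle execution latency, no issue-width limit, and a 1-cycle
    bubble for every dependent instruction that sits in the SIQ.
    """
    cycles: Dict[str, int] = {}
    for insn, producers in deps.items():
        # Operands are ready one cycle after the latest producer issues.
        ready = max((cycles[p] + 1 for p in producers), default=0)
        # Pipelined (two-cycle) wakeup/select adds a bubble in the SIQ.
        bubble = 1 if producers and queue[insn] == SIQ else 0
        cycles[insn] = ready + bubble
    return cycles

# Dependence chain a -> b -> c, plus an independent instruction d.
deps = {"a": [], "b": ["a"], "c": ["b"], "d": []}

all_fast = issue_cycles(deps, {i: FIQ for i in deps})
all_slow = issue_cycles(deps, {i: SIQ for i in deps})
# Hypothetical steering: the chain goes to the FIQ, the independent op to the SIQ.
mixed = issue_cycles(deps, {"a": FIQ, "b": FIQ, "c": FIQ, "d": SIQ})

print(all_fast)  # chain issues back-to-back
print(all_slow)  # chain pays one bubble per dependence
print(mixed)     # same timing as all_fast, with d held in the large queue
```

In this model only the dependent chain is sensitive to queue placement, which is the intuition behind steering a small set of critical instructions to the FIQ.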
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sharkey, J.J., Ponomarev, D.V. (2005). Non-uniform Instruction Scheduling. In: Cunha, J.C., Medeiros, P.D. (eds) Euro-Par 2005 Parallel Processing. Euro-Par 2005. Lecture Notes in Computer Science, vol 3648. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11549468_61
DOI: https://doi.org/10.1007/11549468_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28700-1
Online ISBN: 978-3-540-31925-2
eBook Packages: Computer Science, Computer Science (R0)