Abstract
This paper presents a set of efficient graph transformations for local instruction scheduling. These transformations to the data-dependency graph prune redundant and inferior schedules from the solution space of the problem. An enumerative scheduler solves the transformed problems to optimality faster, and it solves more problems to optimality within a bounded time. Furthermore, heuristic scheduling of the transformed problems often yields improved schedules for hard problems. The basic node-based transformation runs in O(ne) time, where n is the number of nodes and e is the number of edges in the graph. A generalized subgraph-based transformation runs in O(n^2 e) time. The transformations are implemented within the GNU Compiler Collection (GCC) and are evaluated experimentally using the SPEC CPU2000 floating-point benchmarks targeted to various processor models. The results show that the transformations are fast and improve the results of both heuristic and optimal scheduling.
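As a rough illustration of what pruning the solution space means here, the sketch below counts the instruction orders that a small dependency graph allows, before and after one extra edge is added. It is not the paper's transformation: the graph, the added edge, and the helper name count_schedules are assumptions made for illustration. It also ignores latencies and resources, and it does not check that the removed orders are actually redundant or inferior.

```python
from itertools import permutations

def count_schedules(nodes, edges):
    """Count the linear orders of `nodes` that respect every (u, v) edge.

    Brute force over all permutations; only suitable for tiny graphs.
    """
    count = 0
    for order in permutations(nodes):
        position = {node: i for i, node in enumerate(order)}
        if all(position[u] < position[v] for u, v in edges):
            count += 1
    return count

# Hypothetical graph: a, b, c are independent and all feed d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "d"), ("b", "d"), ("c", "d")]

before = count_schedules(nodes, edges)
# Adding the edge a -> b (an assumed, illustrative choice) removes every
# order in which b precedes a.
after = count_schedules(nodes, edges + [("a", "b")])

print(before, after)
```

An enumerative scheduler that searches the second graph has fewer orders to consider, which is the effect the abstract describes. Choosing edges that remove only redundant or inferior schedules is the part the paper contributes and this sketch leaves out.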
Heffernan, M., Wilken, K. Data-Dependency Graph Transformations for Instruction Scheduling. J Sched 8, 427–451 (2005). https://doi.org/10.1007/s10951-005-2862-8
DOI: https://doi.org/10.1007/s10951-005-2862-8