DOI: 10.1145/1168857.1168875

A spatial path scheduling algorithm for EDGE architectures

Published: 20 October 2006

Abstract

Growing on-chip wire delays are motivating architectural features that expose on-chip communication to the compiler. EDGE architectures are one example of communication-exposed microarchitectures in which the compiler forms dataflow graphs that specify how the microarchitecture maps instructions onto a distributed execution substrate. This paper describes a compiler scheduling algorithm called spatial path scheduling that factors in previously fixed locations - called anchor points - for each placement. This algorithm extends easily to different spatial topologies. We augment this basic algorithm with three heuristics: (1) local and global ALU and network link contention modeling, (2) global critical path estimates, and (3) dependence chain path reservation. We use simulated annealing to explore possible performance improvements and to motivate the augmented heuristics and their weighting functions. We show that the spatial path scheduling algorithm augmented with these three heuristics achieves a 21% average performance improvement over the best prior algorithm and comes within an average of 5% of the annealed performance for our benchmarks.
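As an illustration of the anchor-point idea only, here is a minimal sketch of a greedy placer. It is not the paper's algorithm. It assumes a hypothetical rectangular grid of ALUs, treats already-placed producer instructions as the anchor points, and uses Manhattan distance as a stand-in for operand-network latency. The names `place`, `hops`, and `slots_per_alu` are invented for this example. Contention modeling, global critical path estimates, and path reservation (the three heuristics listed in the abstract) are deliberately left out.

```python
from typing import Dict, List, Tuple

Slot = Tuple[int, int]  # (row, col) of an ALU on a hypothetical grid


def hops(a: Slot, b: Slot) -> int:
    """Manhattan distance, an assumed stand-in for network latency."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def place(
    deps: Dict[str, List[str]],  # instruction -> producers, keys in topological order
    latency: Dict[str, int],     # instruction -> execution latency in cycles
    rows: int,
    cols: int,
    slots_per_alu: int,
) -> Tuple[Dict[str, Slot], Dict[str, int]]:
    """Greedily place each instruction where it is estimated to finish earliest."""
    placement: Dict[str, Slot] = {}
    finish: Dict[str, int] = {}
    used = {(r, c): 0 for r in range(rows) for c in range(cols)}

    for instr, producers in deps.items():
        best = None  # (estimated finish cycle, slot)
        for slot, count in used.items():
            if count >= slots_per_alu:
                continue  # this ALU has no free instruction slot
            # Anchor points (assumption): the fixed locations of placed producers.
            # An operand is ready when its producer finishes plus routing time.
            ready = max(
                (finish[p] + hops(placement[p], slot) for p in producers),
                default=0,
            )
            done = ready + latency[instr]
            if best is None or done < best[0]:
                best = (done, slot)
        if best is None:
            raise ValueError("no free slot left for " + instr)
        finish[instr], placement[instr] = best
        used[best[1]] += 1

    return placement, finish


# A diamond-shaped dataflow graph: a feeds b and c, which both feed d.
deps = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
latency = {name: 1 for name in deps}
placement, finish = place(deps, latency, rows=2, cols=2, slots_per_alu=1)
```

The grid topology enters only through `hops`, so a different spatial layout would mean swapping that one function. This sketch checks per-ALU slot capacity but does not model issue-time contention or link contention.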


    Published In

    ASPLOS XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
    October 2006
    440 pages
    ISBN: 1595934510
    DOI: 10.1145/1168857
    • ACM SIGPLAN Notices, Volume 41, Issue 11
      Proceedings of the 2006 ASPLOS Conference
      November 2006
      425 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/1168918
    • ACM SIGARCH Computer Architecture News, Volume 34, Issue 5
      Proceedings of the 2006 ASPLOS Conference
      December 2006
      425 pages
      ISSN: 0163-5964
      DOI: 10.1145/1168919
    • ACM SIGOPS Operating Systems Review, Volume 40, Issue 5
      Proceedings of the 2006 ASPLOS Conference
      December 2006
      425 pages
      ISSN: 0163-5980
      DOI: 10.1145/1168917
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. EDGE architecture
    2. instruction scheduling
    3. path scheduling
    4. simulated annealing

    Qualifiers

    • Article

    Conference

    ASPLOS06

    Acceptance Rates

    ASPLOS XII Paper Acceptance Rate 38 of 158 submissions, 24%;
    Overall Acceptance Rate 535 of 2,713 submissions, 20%

    Article Metrics

    • Downloads (Last 12 months): 5
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 17 Feb 2025

    Cited By

    • (2022) Accelerating Data Transfer in Dataflow Architectures Through a Look-Ahead Acknowledgment Mechanism. Journal of Computer Science and Technology 37:4 (942-959). DOI: 10.1007/s11390-020-0555-6. Online publication date: 30-Jul-2022
    • (2020) Towards Higher Performance and Robust Compilation for CGRA Modulo Scheduling. IEEE Transactions on Parallel and Distributed Systems 31:9 (2201-2219). DOI: 10.1109/TPDS.2020.2989149. Online publication date: 1-Sep-2020
    • (2018) Optimizing the Efficiency of Data Transfer in Dataflow Architectures. 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) (140-149). DOI: 10.1109/HPCC/SmartCity/DSS.2018.00050. Online publication date: Jun-2018
    • (2018) Coarse-Grained Reconfigurable Array Architectures. Handbook of Signal Processing Systems (427-472). DOI: 10.1007/978-3-319-91734-4_12. Online publication date: 14-Oct-2018
    • (2017) A static-placement, dynamic-issue framework for CGRA loop accelerator. Proceedings of the Conference on Design, Automation & Test in Europe (1348-1353). DOI: 10.5555/3130379.3130697. Online publication date: 27-Mar-2017
    • (2017) A static-placement, dynamic-issue framework for CGRA loop accelerator. Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017 (1348-1353). DOI: 10.23919/DATE.2017.7927202. Online publication date: Mar-2017
    • (2017) A simple method to solve the network congestion for spatial architecture. Journal of Shanghai Jiaotong University (Science) 22:1 (72-76). DOI: 10.1007/s12204-017-1802-z. Online publication date: 26-Jan-2017
    • (2016) POSTER. Proceedings of the 2016 International Conference on Parallel Architectures and Compilation (441-442). DOI: 10.1145/2967938.2974054. Online publication date: 11-Sep-2016
    • (2013) A general constraint-centric scheduling framework for spatial architectures. ACM SIGPLAN Notices 48:6 (495-506). DOI: 10.1145/2499370.2462163. Online publication date: 16-Jun-2013
    • (2013) A general constraint-centric scheduling framework for spatial architectures. Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (495-506). DOI: 10.1145/2491956.2462163. Online publication date: 16-Jun-2013
