DOI: 10.1145/3582514.3582521
research-article

Towards Maximum Throughput of Dataflow Software Pipeline under Resource Constraints

Published: 25 February 2023

ABSTRACT

This work proposes a novel algorithm and an Integer Linear Programming (ILP) formulation to optimize the pipelined code mapping of dataflow graphs generated by optimizing compilers under a given resource budget. The goal of this optimization technique is to maximize the throughput of dataflow software pipelining under the given budget, i.e., when the system cannot provide the minimum number of FIFO buffers needed to optimally balance the dataflow graph. The proposed algorithm uses a two-fold solution: it combines a well-established optimal dataflow graph balancing ILP formulation, which does not consider resource budget constraints, with our proposed ILP formulation, which does. Our algorithm efficiently maximizes the throughput of a dataflow software pipeline under a given resource budget. Additionally, we introduce a cycle-accurate dataflow graph simulator for evaluating various balancing techniques. We perform an experimental evaluation of different optimization techniques and show that our proposed algorithm performs relatively well compared to existing techniques.
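To make the balancing problem concrete, the following is a minimal sketch and not the paper's ILP formulation. It assumes every actor has unit latency and uses simple longest-path leveling, which yields a valid balancing but not necessarily the minimum buffer count. The example graph, the function names, and the `budget` value are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def longest_path_levels(edges):
    """Return each node's level: the longest path length from any source."""
    preds = {}
    for u, v in edges:
        preds.setdefault(u, set())
        preds.setdefault(v, set()).add(u)
    levels = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(preds).static_order():
        levels[node] = max((levels[p] + 1 for p in preds[node]), default=0)
    return levels


def balancing_buffers(edges):
    """FIFO buffers per edge so that all paths between two nodes have equal length.

    Assumption: unit-latency actors. An edge spanning k levels needs k - 1 buffers.
    """
    levels = longest_path_levels(edges)
    return {(u, v): levels[v] - levels[u] - 1 for u, v in edges}


# Hypothetical graph: a long path a -> b -> c -> d and a short edge a -> d.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
buffers = balancing_buffers(edges)
total = sum(buffers.values())

budget = 1  # hypothetical resource budget (number of FIFO buffers available)
print("buffers per edge:", buffers)
print("total needed:", total, "| fits budget:", total <= budget)
```

In this toy graph the short edge needs more buffers than the budget allows, which is the situation the abstract describes: full balancing is not affordable, so the remaining question is where to place the available buffers to lose as little throughput as possible.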

    • Published in

      PMAM'23: Proceedings of the 14th International Workshop on Programming Models and Applications for Multicores and Manycores
      February 2023
      73 pages
      ISBN: 9798400701153
      DOI: 10.1145/3582514

      Copyright © 2023 ACM

      Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 53 of 97 submissions, 55%