ABSTRACT
Research on High-Level Synthesis (HLS) has mainly focused on applications with statically determinable characteristics, and current tools often perform poorly in the presence of data-dependent memory accesses. The reason is that they rely on conservative static scheduling strategies, which lead to inefficient implementations. In this work, we propose to address this issue by leveraging well-known techniques used in superscalar processors to perform runtime memory disambiguation. Our approach, implemented as a source-to-source transformation at the C level, demonstrates significant performance improvements for a moderate increase in area while retaining portability among HLS tools.
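To make the idea concrete, the following is a minimal sketch of what a C-level rewrite with a runtime address check could look like; it is an illustration under stated assumptions, not the transformation described in the paper. The histogram kernel, the function names `hist_baseline` and `hist_forward`, and the single-entry forwarding register are all hypothetical. In the baseline loop, a static scheduler cannot tell whether `hist[idx[i]]` aliases the previous iteration's write, so it has to assume that it does. The rewritten loop compares addresses at run time and forwards the last written value when they match, which is one simple form of runtime memory disambiguation.

```c
#include <stddef.h>

/* Baseline: the index comes from data, so a static scheduler must
 * conservatively assume each read depends on the previous write. */
void hist_baseline(const int *idx, const int *w, size_t n, int *hist) {
    for (size_t i = 0; i < n; i++)
        hist[idx[i]] += w[i];
}

/* Hypothetical rewrite: remember the last written address and value,
 * compare at run time, and forward the value on a match instead of
 * re-reading memory. The result is identical to the baseline. */
void hist_forward(const int *idx, const int *w, size_t n, int *hist) {
    int last_addr = -1;   /* no valid previous write yet */
    int last_val = 0;
    for (size_t i = 0; i < n; i++) {
        int a = idx[i];
        /* runtime disambiguation: does this read hit the last write? */
        int old = (a == last_addr) ? last_val : hist[a];
        int upd = old + w[i];
        hist[a] = upd;
        last_addr = a;
        last_val = upd;
    }
}
```

A single forwarding register only covers a dependence distance of one iteration; a deeper pipeline would presumably need to track more in-flight writes, which is where the area cost mentioned in the abstract would come from.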
Index Terms
- Runtime dependency analysis for loop pipelining in high-level synthesis