DOI: 10.1145/3649329.3657352
research-article
Open access

Revisiting Automatic Pipelining: Gate-level Forwarding and Speculation

Published: 07 November 2024

Abstract

Pipelining is a widely applied micro-architectural performance optimization, and achieving high execution throughput requires non-trivial design effort. The key to optimizing pipeline throughput is resolving data hazards caused by read-after-write (RAW) dependencies, which are traditionally handled by forwarding and speculation to avoid pipeline stalls. However, existing approaches rely on high-level dataflow analysis and may miss optimization opportunities because they do not analyze the netlist structure.
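
To make the cost of a RAW dependency concrete, the following minimal sketch counts stall cycles in a toy single-issue, in-order pipeline. It is an illustration only, not the paper's method: the `Instr` type, the `count_stalls` function, the register names, and the latency values are all assumptions chosen for the example. A `result_latency` of 3 stands in for a pipeline without forwarding (a consumer must wait until the producer's result is written back), and a latency of 1 stands in for full forwarding.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: one destination, several sources."""
    dest: Optional[str]
    srcs: Tuple[str, ...]


def count_stalls(program: Sequence[Instr], result_latency: int) -> int:
    """Count stall cycles caused by RAW dependencies.

    result_latency is the number of cycles between the issue of a producer
    and the earliest issue of a consumer that reads its result
    (1 means back-to-back issue is possible).
    """
    ready = {}   # register -> earliest cycle at which a reader may issue
    cycle = 0    # cycle in which the next instruction would issue
    stalls = 0
    for ins in program:
        # A RAW hazard exists if any source is not ready yet.
        earliest = max([ready.get(s, 0) for s in ins.srcs], default=0)
        if earliest > cycle:
            stalls += earliest - cycle
            cycle = earliest
        if ins.dest is not None:
            ready[ins.dest] = cycle + result_latency
        cycle += 1
    return stalls


# r1 = r2 + r3 ; r4 = r1 + r5 ; r6 = r4 + r1  (two chained RAW dependencies)
PROGRAM = [
    Instr("r1", ("r2", "r3")),
    Instr("r4", ("r1", "r5")),
    Instr("r6", ("r4", "r1")),
]

stalls_without_forwarding = count_stalls(PROGRAM, result_latency=3)
stalls_with_forwarding = count_stalls(PROGRAM, result_latency=1)
```

In this toy model, each dependent instruction waits two extra cycles when results are only visible after write-back, and no cycles when results are forwarded as soon as they are produced.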
We propose an efficient method that resolves RAW dependencies through low-level netlist analysis, using gate-level forwarding and speculation. The method combines a greedy search that detects and resolves short-delay gate-level signal paths for forwarding with an approximate circuit synthesis method, backed by formal verification, for gate-level speculation, and thereby exploits gate-level information to further improve pipeline throughput. Although the gate-level design space is vast, assembling the logic gates on the short-delay signal paths into fewer adjacent pipeline stages lets the forwarding paths and the speculator bypass invalid pipeline stages and thus conceal the latency of these paths. We conduct experiments on the widely used ISCAS/EPFL benchmark circuits and a large-scale RISC-V CPU, and our approach can find better designs than human experts.
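
The idea that some signals in a netlist settle much earlier than others can be sketched with a simple arrival-time computation. The example below is not the greedy search described in the abstract; it is a plain unit-delay timing pass over a small hand-written netlist, and the netlist encoding, the `arrival_times` function, and the delay budget are assumptions made for illustration. It shows that in a 2-bit ripple-carry adder the low sum bit is available after one gate delay, while the carry-out needs three, so the low bit is a natural candidate for being forwarded from an earlier point.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical netlist encoding: signal name -> (gate type, fan-in signals).
Netlist = Dict[str, Tuple[str, List[str]]]


def arrival_times(netlist: Netlist, inputs: Iterable[str]) -> Dict[str, int]:
    """Return the worst-case arrival time of every signal, in gate delays.

    Primary inputs arrive at time 0 and every gate adds one unit of delay.
    """
    memo = {s: 0 for s in inputs}

    def visit(sig: str) -> int:
        if sig not in memo:
            _, fanins = netlist[sig]
            memo[sig] = 1 + max(visit(f) for f in fanins)
        return memo[sig]

    for sig in netlist:
        visit(sig)
    return memo


# 2-bit ripple-carry adder without carry-in: {c1, s1, s0} = a + b.
ADDER: Netlist = {
    "s0": ("XOR", ["a0", "b0"]),
    "c0": ("AND", ["a0", "b0"]),
    "p1": ("XOR", ["a1", "b1"]),
    "g1": ("AND", ["a1", "b1"]),
    "s1": ("XOR", ["p1", "c0"]),
    "t1": ("AND", ["p1", "c0"]),
    "c1": ("OR", ["g1", "t1"]),
}

arr = arrival_times(ADDER, inputs=["a0", "b0", "a1", "b1"])

# Outputs that settle within a (hypothetical) budget of one gate delay.
OUTPUTS = ["s0", "s1", "c1"]
early_outputs = [o for o in OUTPUTS if arr[o] <= 1]
```

A dataflow-level view treats the adder as one operation with one latency; the per-signal view above is the kind of information only a netlist-level analysis can see.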

Published In

DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference
June 2024
2159 pages
ISBN:9798400706011
DOI:10.1145/3649329
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Funding Sources

  • NSF of China
  • CAS Project for Young Scientists in Basic Research
  • Youth Innovation Promotion Association CAS

Conference

DAC '24
DAC '24: 61st ACM/IEEE Design Automation Conference
June 23 - 27, 2024
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
