Quick-Div: Rethinking Integer Divider Design for FPGA-based Soft-processors

Published: 04 February 2022

Abstract

In today’s FPGA-based soft-processors, one of the slowest instructions is integer division. Compared to the low single-digit latency of other arithmetic operations, the fixed 32-cycle latency of radix-2 division is substantially longer. Given that today’s soft-processors typically only implement radix-2 division—if they support hardware division at all—there is significant potential to improve the performance of integer dividers.
In this work, we present a set of high-performance, data-dependent, variable-latency integer dividers for FPGA-based soft-processors that we call Quick-Div. We compare them to various radix-N dividers and provide a thorough analysis in terms of latency and resource usage. In addition, we analyze the frequency scaling for such divider designs when (1) treated as a stand-alone unit and (2) integrated as part of a high-performance soft-processor. Moreover, we provide additional theoretical analysis of different dividers’ behaviour and develop a new better-performing Quick-Div variant, called Quick-radix-4. Experimental results show that our Quick-radix-4 design can achieve up to 6.8× better performance and 6.1× better performance-per-LUT over the radix-2 divider for applications such as random number generation. Even in cases where division operations constitute as little as 1% of all executed instructions, Quick-radix-4 provides a performance uplift of 16% compared to the radix-2 divider.
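The gap between fixed-latency and data-dependent division can be illustrated with a small software model. The sketch below is only an illustration under stated assumptions: it is not the Quick-Div or Quick-radix-4 design, whose internals the abstract does not describe. The function names `radix2_divide` and `early_start_divide` are hypothetical, operands are treated as unsigned, and the reported "cycles" are loop iterations of the model rather than measured hardware latency. The first function performs restoring radix-2 division, producing one quotient bit per iteration for a fixed 32 iterations. The second skips the leading quotient bits that are known to be zero from the operands' bit lengths, so its iteration count depends on the data.

```python
def radix2_divide(dividend: int, divisor: int, width: int = 32):
    """Restoring radix-2 division: always `width` iterations (modelled cycles)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    mask = (1 << width) - 1
    dividend &= mask  # unsigned model: keep only the low `width` bits
    divisor &= mask
    quotient, remainder, cycles = 0, 0, 0
    for i in range(width - 1, -1, -1):
        # Shift in the next dividend bit, then try to subtract the divisor.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
        cycles += 1
    return quotient, remainder, cycles


def early_start_divide(dividend: int, divisor: int, width: int = 32):
    """Same recurrence, but skips quotient bits that must be zero.

    Assumption for illustration only: the iteration count is derived from
    the difference in operand bit lengths, as a leading-zero counter might
    provide in hardware.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    mask = (1 << width) - 1
    dividend &= mask
    divisor &= mask
    shift = dividend.bit_length() - divisor.bit_length()
    if shift < 0:
        # Dividend is smaller than divisor: quotient is 0, no iterations.
        return 0, dividend, 0
    # The bits above position `shift` form a value smaller than the divisor,
    # so they can be loaded into the remainder directly.
    remainder = dividend >> (shift + 1)
    quotient, cycles = 0, 0
    for i in range(shift, -1, -1):
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
        cycles += 1
    return quotient, remainder, cycles


print(radix2_divide(100, 7))       # (14, 2, 32)
print(early_start_divide(100, 7))  # (14, 2, 5)
```

In this model, dividing 100 by 7 takes 32 iterations with the fixed-latency loop and 5 with the data-dependent one. The saving shrinks as the quotient grows wider, which is why the benefit of a variable-latency divider depends on the operand distribution of the workload.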

Published In

ACM Transactions on Reconfigurable Technology and Systems, Volume 15, Issue 3
September 2022
353 pages
ISSN: 1936-7406
EISSN: 1936-7414
DOI: 10.1145/3508070
Editor: Deming Chen

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 February 2022
Accepted: 01 November 2021
Revised: 01 October 2021
Received: 01 May 2021
Published in TRETS Volume 15, Issue 3

Author Tags

  1. Arithmetic operator
  2. integer divider
  3. variable-latency pipeline
  4. soft-processor

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • NSERC Discovery
  • NSERC Alliance
  • CFI John R. Evans Leaders Fund
  • Simon Fraser University New Faculty Start-up Grant
