Low Latency Floating-Point Division and Square Root Unit


Abstract:

Digit-recurrence algorithms are widely used in current microprocessors to compute floating-point division and square root. These iterative algorithms offer a good trade-off in terms of performance, area and power. We present a floating-point division and square root unit, which implements a radix-64 floating-point division and a radix-16 floating-point square root. To keep the implementation affordable, each radix-64 division iteration and each radix-16 square root iteration is made of simpler radix-4 iterations: 3 radix-4 iterations in division and 2 in square root. Speculation is used between consecutive radix-4 iterations to reduce the timing. There are three different parts in digit-recurrence implementations: initialization, digit iterations, and rounding. The digit iteration is the iterative part and it uses the same logic for several cycles. Division and square root partially share the initialization and rounding stages, whereas each one has its own logic for the digit iterations. The result is a low-latency floating-point divider and square root unit, requiring 11, 6, and 4 cycles for double, single and half-precision division with normalized operands and result, and 15, 8 and 5 cycles for square root. One or two additional cycles are needed in the case of subnormal operand(s) or result.
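
The idea of building one radix-64 division iteration from three radix-4 iterations can be illustrated with a small software sketch. This is not the unit described in the abstract: the function names, the digit set {-2, ..., 2}, the exact-arithmetic digit selection, and the restriction to significands in [1, 2) are all assumptions made for illustration. A hardware design would select digits from truncated residual estimates and would overlap the radix-4 steps speculatively, neither of which is modeled here.

```python
from fractions import Fraction


def radix4_step(w, d):
    """One radix-4 recurrence step (illustrative, exact arithmetic).

    Picks a quotient digit q in {-2, ..., 2} (assumed digit set) and
    returns (q, next residual), where next residual = 4*w - q*d.
    """
    shifted = 4 * w
    # Nearest-integer digit selection, clamped to the assumed digit set.
    q = max(-2, min(2, round(shifted / d)))
    return q, shifted - q * d


def radix64_divide(x, d, cycles):
    """Approximate x/d for significands x, d in [1, 2).

    Each of the `cycles` outer iterations retires 6 quotient bits by
    chaining three radix-4 steps. Returns (quotient, final residual).
    """
    x, d = Fraction(x), Fraction(d)
    if not (1 <= x < 2 and 1 <= d < 2):
        raise ValueError("expects normalized significands in [1, 2)")
    w = x / 4                # initial residual, so that |w| <= (2/3) * d
    quotient = Fraction(0)
    weight = Fraction(1)     # weight of the next quotient digit
    for _ in range(cycles):
        for _ in range(3):   # three radix-4 steps form one radix-64 iteration
            q, w = radix4_step(w, d)
            quotient += q * weight
            weight /= 4
    return quotient, w
```

Under these assumptions the residual stays bounded by (2/3)·d after every step, so after n radix-4 digits the quotient differs from x/d by at most (2/3)·4^(1-n). Calling `radix64_divide(1.5, 1.25, 9)` therefore produces 27 radix-4 digits, that is 54 quotient bits. The cycle count of 9 is chosen only for the example and is not a breakdown of the latencies quoted in the abstract.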
Published in: IEEE Transactions on Computers (Volume: 69, Issue: 2, 01 February 2020)
Page(s): 274 - 287
Date of Publication: 16 October 2019

