Abstract
A hardware architecture for quadruple precision floating point division with multi-precision support is presented. Division is an important arithmetic operation, yet it is far more complex than addition and multiplication and demands a significant amount of hardware resources for a complete implementation. In addition to quadruple precision, the proposed architecture supports single-, double-, and double-extended precision computations, with latency that varies with the precision. It is an iterative, multiplicative-based design with small area and promising performance. The mantissa division unit, the most complex sub-unit, employs a series expansion methodology of division. The architecture follows the standard state-of-the-art flow for floating point division, processing both normal and subnormal operands. The proposed architecture is synthesized using a UMC 90nm ASIC standard cell library. It is also demonstrated with a Xilinx FPGA-based implementation, in which the mantissa divider is integrated with a wide integer multiplier that is further optimized for FPGAs to use the built-in DSP blocks efficiently. Compared with the existing quadruple precision divider available in the literature, the proposed architecture achieves a 25% equivalent area saving and a 2\(\times\) improvement in latency, with improved speed, on the FPGA platform; on the ASIC platform it achieves more than 50% area saving and a 3\(\times\) improvement in latency-throughput, with better speed.
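To make the idea of series-expansion division concrete, the following is a minimal Python sketch of one common formulation, not the authors' datapath. It assumes the divisor mantissa y in [1, 2) is split into a leading part y_h and a remainder y_l, so that 1/y = (1/y_h)(1 - t + t^2 - ...) with t = y_l/y_h. The function names, the 8-bit split (`table_bits`), and the use of exact rationals in place of a rounded lookup table and fixed-width multipliers are all assumptions made for illustration.

```python
from fractions import Fraction


def series_reciprocal(y: Fraction, table_bits: int = 8, terms: int = 4) -> Fraction:
    """Approximate 1/y for a mantissa 1 <= y < 2 using a truncated series.

    Hypothetical sketch: exact rationals stand in for the lookup table and
    the fixed-width multipliers that a hardware divider would use.
    """
    if not (1 <= y < 2):
        raise ValueError("mantissa must lie in [1, 2)")
    scale = 1 << table_bits
    # y_h keeps the leading bits of y (the part a lookup table would index).
    y_h = Fraction(int(y * scale), scale)
    y_l = y - y_h              # 0 <= y_l < 2**-table_bits
    r = 1 / y_h                # assumed exact here; hardware stores a rounded value
    t = y_l * r                # 0 <= t < 2**-table_bits
    # 1/y = r / (1 + t) = r * (1 - t + t**2 - ...), truncated after `terms` terms.
    acc = Fraction(0)
    power = Fraction(1)
    for k in range(terms):
        acc += power if k % 2 == 0 else -power
        power *= t
    return r * acc


def series_divide(x: Fraction, y: Fraction, table_bits: int = 8, terms: int = 4) -> Fraction:
    """Approximate the mantissa quotient x / y as x * (1/y)."""
    return x * series_reciprocal(y, table_bits, terms)


def relative_error(x: Fraction, y: Fraction, table_bits: int = 8, terms: int = 4) -> Fraction:
    """Relative error of the truncated-series quotient against the exact quotient."""
    exact = x / y
    return abs(series_divide(x, y, table_bits, terms) - exact) / exact
```

In this sketch the relative error after n terms is t^n, which is below 2^(-table_bits * n). Under the assumed 8-bit split, 15 terms would therefore be enough to pass the 113-bit significand of the quadruple precision format, while fewer terms suffice for the narrower formats, which is one way varied latency across precisions can arise in an iterative design.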

Additional information
This work is partly supported by the Research Grants Council of Hong Kong (Project GRF 17245716), and the Croucher Foundation (Croucher Innovation Award 2013).
About this article
Cite this article
Jaiswal, M.K., So, H.KH. An Unified Architecture for Single, Double, Double-Extended, and Quadruple Precision Division. Circuits Syst Signal Process 37, 383–407 (2018). https://doi.org/10.1007/s00034-017-0559-9
DOI: https://doi.org/10.1007/s00034-017-0559-9