
An Unified Architecture for Single, Double, Double-Extended, and Quadruple Precision Division

Circuits, Systems, and Signal Processing

Abstract

A hardware architecture for quadruple precision floating point division with multi-precision support is presented. Division is an important yet far more complex arithmetic operation than addition and multiplication, and a complete implementation demands a significant amount of hardware resources. The proposed architecture also supports single, double, and double-extended precision computations, with latency that varies by precision. An iterative, multiplication-based architecture for multi-precision quadruple precision division is proposed, with small area and promising performance. The mantissa division unit, the most complex sub-unit, employs a series expansion method of division. The architecture follows the standard state-of-the-art flow for floating point division, handling both normal and subnormal operands. The proposed division architecture is synthesized using the UMC 90 nm ASIC standard cell library. It is also demonstrated with a Xilinx FPGA-based implementation, in which the mantissa division is integrated with a wide integer multiplier that is further optimized for FPGAs to make efficient use of the built-in DSP blocks. Compared to the existing quadruple precision divider available in the literature, the proposed architecture achieves a 25% equivalent area saving and a 2\(\times\) improvement in latency with improved speed on the FPGA platform, and more than 50% area saving and a 3\(\times\) improvement in latency-throughput with better speed on the ASIC platform.
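The abstract does not spell out the series expansion itself, so the following is only an illustrative sketch of one common formulation, not the authors' design. It assumes the normalized divisor mantissa y in [1, 2) is split into its leading bits a1 and the remainder a2, so that 1/y = (1/a1)(1 - x + x^2 - ...) with x = a2/a1. The names `series_reciprocal` and `divide_mantissas`, and the parameters `lead_bits` and `terms`, are hypothetical. Exact rationals stand in for fixed-point hardware words, and the direct computation of 1/a1 stands in for what would plausibly be a small lookup table.

```python
from fractions import Fraction


def series_reciprocal(y: Fraction, lead_bits: int = 8, terms: int = 4) -> Fraction:
    """Approximate 1/y for a normalized mantissa 1 <= y < 2.

    Assumed scheme: y = a1 + a2, where a1 keeps the integer bit plus
    `lead_bits` fraction bits and a2 is the remaining low-order part.
    """
    if not (1 <= y < 2):
        raise ValueError("mantissa must be normalized to [1, 2)")
    scale = 1 << lead_bits
    a1 = Fraction(int(y * scale), scale)  # leading bits, truncated
    a2 = y - a1                           # 0 <= a2 < 2**-lead_bits
    inv_a1 = 1 / a1                       # assumption: a table lookup in hardware
    x = a2 * inv_a1                       # small ratio, x < 2**-lead_bits

    # 1/y = 1/(a1 * (1 + x)) = inv_a1 * (1 - x + x**2 - x**3 + ...)
    total = Fraction(0)
    power = Fraction(1)
    for _ in range(terms):
        total += power
        power *= -x
    return inv_a1 * total


def divide_mantissas(n: Fraction, d: Fraction,
                     lead_bits: int = 8, terms: int = 4) -> Fraction:
    """Quotient n/d formed as n times the approximate reciprocal of d."""
    return n * series_reciprocal(d, lead_bits, terms)


if __name__ == "__main__":
    n = Fraction(5, 4)
    d = Fraction(0x1ABCDEF, 1 << 24)
    q = divide_mantissas(n, d)
    print(float(q), float(n / d))
```

Truncating the series after `terms` terms leaves a relative error of x^terms, which is below 2^(-lead_bits * terms). Under these assumptions, wider precisions need either more terms or more leading bits, which is one plausible reason an iterative design would show latency that varies with the precision mode.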



Author information

Corresponding author

Correspondence to Manish Kumar Jaiswal.

Additional information

This work is partly supported by the Research Grants Council of Hong Kong (Project GRF 17245716), and the Croucher Foundation (Croucher Innovation Award 2013).

About this article

Cite this article

Jaiswal, M.K., So, H.K.H. An Unified Architecture for Single, Double, Double-Extended, and Quadruple Precision Division. Circuits Syst Signal Process 37, 383–407 (2018). https://doi.org/10.1007/s00034-017-0559-9

  • DOI: https://doi.org/10.1007/s00034-017-0559-9
