Radix-10 Restoring Square Root for 6-input LUTs Programmable Devices

Circuits, Systems, and Signal Processing

Abstract

This paper proposes efficient fixed-point and floating-point implementations of radix-10 square root for Xilinx FPGA devices. The method implements a digit-recurrence restoring algorithm and supports the three decimal floating-point (DFP) types specified in the IEEE 754-2008 standard. The technique used for restoring is optimal and novel. The designs use new techniques based on the efficient utilization of dedicated resources in the programmable devices. Implementations were made in Xilinx 7-series devices. The fixed-point square root designs operate at up to 212 MHz for p=7, 197 MHz for p=16, and 190 MHz for p=34, where p is the precision in decimal digits. The DFP square root designs reach operating frequencies of 194 MHz for p=7, 183 MHz for p=16, and 174 MHz for p=34. The proposed architecture achieves better computation times than related works.
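
As an illustration of the restoring digit recurrence named in the abstract, the following is a minimal software sketch of the textbook radix-10 scheme on integers. It is not the authors' hardware design: the novel restoring technique and the LUT-level mapping described in the paper are not reproduced here, and the function name `sqrt_radix10_restoring` and its interface are assumptions made for this example. Each iteration brings down two decimal digits of the radicand, selects one root digit by trial subtraction, and keeps the previous remainder whenever a trial result would be negative (the restoring step).

```python
from typing import Tuple


def sqrt_radix10_restoring(x: int, p: int) -> Tuple[int, int]:
    """Return (q, r) with q = floor(sqrt(x)) and r = x - q*q.

    x is a non-negative integer with at most 2*p decimal digits,
    so the root q has at most p decimal digits. Hypothetical helper
    for illustration only.
    """
    if p < 1 or not 0 <= x < 10 ** (2 * p):
        raise ValueError("x must satisfy 0 <= x < 10**(2*p) with p >= 1")

    q = 0  # partial root built so far
    r = 0  # partial remainder
    for j in range(p):
        # Bring down the next pair of decimal digits, most significant first.
        shift = 2 * (p - 1 - j)
        pair = (x // 10 ** shift) % 100
        r = 100 * r + pair

        # Digit selection by trial subtraction:
        # (10q + d + 1)^2 - (10q + d)^2 = 20q + 2d + 1.
        d = 0
        while True:
            trial = r - (20 * q + 2 * d + 1)
            if trial < 0:
                break  # restore: discard the negative trial, keep r
            r = trial
            d += 1

        q = 10 * q + d  # append the selected digit (always 0..9)
    return q, r
```

For example, with p=7 the call `sqrt_radix10_restoring(2 * 10**12, 7)` yields the root 1414213, the first seven significant digits of the square root of 2. A hardware implementation would typically evaluate the candidate subtractions in parallel rather than in a sequential loop; the loop is used here only to keep the recurrence easy to follow.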

References

  1. A. Amaricai, O. Boncalo, FPGA implementation of very high radix square root with prescaling, in 2012 19th IEEE International Conference on Electronics, Circuits, and Systems (ICECS 2012), pp. 221–224. IEEE (2012)

  2. J.M. Anderson, C. Tsen, L.K. Wang, K. Compton, J.M. Schulte, Performance analysis of decimal floating-point libraries and its impact on decimal hardware and software solutions, in 2009 IEEE International Conference on Computer Design, pp. 465–471. IEEE (2009)

  3. F. Batista, Decimal data type. World Wide Web. http://www.python.org/dev/peps/pep-0327, version 62268 (2003). Accessed 20 Feb 2020

  4. M. Bhat, J. Crawford, R. Morin, K. Shiv, Performance characterization of decimal arithmetic in commercial Java workloads, in 2007 IEEE International Symposium on Performance Analysis of Systems & Software, pp. 54–61. IEEE (2007)

  5. P. Corsonello, S. Perri, High performance square rooting circuit using hybrid radix-2 adders. Electron. Lett. 35(3), 185–186 (1999)

  6. M. Cowlishaw, The decNumber library, v3.68. World Wide Web. http://speleotrove.com/decimal/decnumber.pdf (2010). Accessed 2 Feb 2020

  7. M.F. Cowlishaw, Decimal floating-point: Algorism for computers, in Proceedings 2003 16th IEEE Symposium on Computer Arithmetic, pp. 104–111. IEEE (2003)

  8. P. Crismer, Eiffel decimal arithmetic library. World Wide Web. http://www.gobosoft.com/eiffel/gobo/math/decimal/index.html (2019). Accessed 18 Feb 2020

  9. D. Currie, Lua decnumber library. World Wide Web. http://files.luaforge.net/releases/ldecnumber/ldecnumber/ldecNumber-21, version 21 (2007). Accessed 19 Jan 2020

  10. F. De Dinechin, M. Joldes, B. Pasca, G. Revy, Multiplicative square root algorithms for FPGAs, in 2010 International Conference on Field Programmable Logic and Applications, pp. 574–577. IEEE (2010)

  11. N. Dlodlo, M. Mofolo, L. Masoane, S. Mncwabe, G. Sibiya, L. Mboweni, Research trends in existing technologies that are building blocks to the Internet of Things, in Innovations and Advances in Computing, Informatics, Systems Sciences, Networking and Engineering, pp. 539–548. Springer (2015)

  12. A.Y. Duale, M.H. Decker, H.G. Zipperer, M. Aharoni, T.J. Bohizic, Decimal floating-point in z9: an implementation and testing perspective. IBM J. Res. Develop. 51(1.2), 217–227 (2007)

  13. L. Eisen, J. Ward, H.W. Tast, N. Mading, J. Leenstra, S.M. Mueller, C. Jacobi, J. Preiss, E.M. Schwarz, S.R. Carlough, IBM POWER6 accelerators: VMX and DFU. IBM J. Res. Develop. 51(6), 1–21 (2007)

  14. M. Ercegovac, J.M. Muller, Digit-recurrence algorithms for division and square root with limited precision primitives, in The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, vol. 2, pp. 1440–1444. IEEE (2003)

  15. M.D. Ercegovac, T. Lang, Digital Arithmetic (Elsevier, Amsterdam, 2004)

  16. M.D. Ercegovac, R. McIlhenny, Design and FPGA implementation of radix-10 algorithm for square root with limited precision primitives, in 2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers, pp. 935–939. IEEE (2009)

  17. M.D. Ercegovac, R. McIlhenny, Design and FPGA implementation of radix-10 combined division/square root algorithm with limited precision primitives, in 2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers, pp. 87–91. IEEE (2010)

  18. M.D. Ercegovac, R. McIlhenny, Shared implementation of radix-10 and radix-16 square root algorithm with limited precision primitives, in 2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 345–349. IEEE (2012)

  19. J. Fandrianto, Algorithm for high speed shared radix 4 division and radix 4 square-root, in 1987 IEEE 8th Symposium on Computer Arithmetic (ARITH), pp. 73–79. IEEE (1987)

  20. J. Fandrianto, Algorithm for high speed shared radix 8 division and radix 8 square root, in Proceedings of 9th Symposium on Computer Arithmetic, pp. 68–75 (1989)

  21. A. Hosseiny, G. Jaberipur, Decimal square root: algorithm and hardware implementation. Circuits Syst. Sig. Process. 35(12), 4195–4219 (2016)

  22. IEEE: IEEE standard for floating-point arithmetic. IEEE Std 754-2008, pp. 1–70 (2008). https://doi.org/10.1109/IEEESTD.2008.4610935

  23. A. Jena, S.K. Panda, FPGA-VHDL implementation of pipelined square root circuit for VLSI signal processing applications. Int. J. Comp. Appl. 975, 8887 (2016)

  24. K. Jun, E.E. Swartzlander, Improved non-restoring square root algorithm with dual path calculation, in 2014 48th Asilomar Conference on Signals, Systems and Computers, pp. 1243–1246. IEEE (2014)

  25. H. Kabuo, T. Taniguchi, A. Miyoshi, H. Yamashita, M. Urano, H. Edamatsu, S. Kuninobu, Accurate rounding scheme for the Newton-Raphson method using redundant binary representation. IEEE Trans. Comp. 43(1), 43–51 (1994)

  26. A. Kaivani, S.B. Ko, Decimal SRT square root: algorithm and architecture. Circuits Syst. Sig. Process. 32(5), 2137–2150 (2013)

  27. M. Kavis, The internet of things will radically change your big data strategy (2014). http://www.forbes.com/sites/mikekavis/2014/06/26/the-internet-of-things-will-radically-change-your-big-data-strategy/. Accessed 17 Feb 2020

  28. T.J. Kwon, J. Draper, Floating-point division and square root implementation using a Taylor-series expansion algorithm with reduced look-up tables, in 2008 51st Midwest Symposium on Circuits and Systems, pp. 954–957. IEEE (2008)

  29. LabSET: VHDL implementation of radix-10 restoring square root (2019). https://github.com/LabSET-UNICEN/radix-10_sqrt

  30. S. Lachowicz, H.J. Pfleiderer, Fast evaluation of the square root and other nonlinear functions in FPGA, in 4th IEEE International Symposium on Electronic Design, Test and Applications (DELTA 2008), pp. 474–477. IEEE (2008)

  31. T. Lang, P. Montuschi, Very-high radix combined division and square root with prescaling and selection by rounding, in Proceedings of the 12th Symposium on Computer Arithmetic, pp. 124–131. IEEE (1995)

  32. T. Lang, P. Montuschi, Very high radix square root with prescaling and rounding and a combined division/square root unit. IEEE Trans. Comp. 48(8), 827–841 (1999)

  33. Y. Li, W. Chu, A new non-restoring square root algorithm and its VLSI implementations, in Proceedings International Conference on Computer Design. VLSI in Computers and Processors, pp. 538–544. IEEE (1996)

  34. Y. Li, W. Chu, Implementation of single precision floating point square root on FPGAs, in Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No. 97TB100186), pp. 226–232. IEEE (1997)

  35. Y. Li, W. Chu, Parallel-array implementations of a non-restoring square root algorithm, in Proceedings International Conference on Computer Design VLSI in Computers and Processors, pp. 690–695. IEEE (1997)

  36. S.E. McQuillan, J.V. McCanny, R. Hamill, New algorithms and VLSI architectures for SRT division and square root, in Proceedings of IEEE 11th Symposium on Computer Arithmetic, pp. 80–86. IEEE (1993)

  37. P. Montuschi, L. Ciminiera, Reducing iteration time when result digit is zero for radix 2 SRT division and square root with redundant remainders. IEEE Trans. Comp. 42(2), 239–246 (1993)

  38. A. Nannarelli, Decimal engine for energy-efficient multicore processors, in 2014 22nd International Conference on Very Large Scale Integration (VLSI-SoC), pp. 1–6. IEEE (2014)

  39. B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs (Oxford University Press, Oxford, 2000)

  40. J.A. Pineiro, J.D. Bruguera, High-speed double-precision computation of reciprocal, division, square root, and inverse square root. IEEE Trans. Comp. 51(12), 1377–1388 (2002)

  41. A. Rahman et al., New efficient hardware design methodology for modified non-restoring square root algorithm, in 2014 International Conference on Informatics, Electronics & Vision (ICIEV), pp. 1–6. IEEE (2014)

  42. C.V. Ramamoorthy, J.R. Goodman, K. Kim, Some properties of iterative square-rooting methods using high-speed multiplication. IEEE Trans. Comp. 100(8), 837–847 (1972)

  43. I. Sajid, M. Ahmed, S.G. Ziavras, Novel pipelined architecture for efficient evaluation of the square root using a modified non-restoring algorithm. J. Sig. Process. Syst. 67(2), 157–166 (2012)

  44. M.J. Schulte, N. Lindberg, A. Laxminarain, Performance evaluation of decimal floating-point arithmetic, in Proceedings of the 6th IBM Austin Center for Advanced Studies Conference (2005)

  45. E.M. Schwarz, J.S. Kapernick, M.F. Cowlishaw, Decimal floating-point support on the IBM System z10 processor. IBM J. Res. Develop. 53(1), 1–4 (2009)

  46. S. Suresh, S.F. Beldianu, S.G. Ziavras, FPGA and ASIC square root designs for high performance and power efficiency, in 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors, pp. 269–272. IEEE (2013)

  47. N. Takagi, K. Takagi, A VLSI algorithm for integer square-rooting, in 2006 International Symposium on Intelligent Signal Processing and Communications, pp. 626–629. IEEE (2006)

  48. A.J. Thakkar, A. Ejnioui, Design and implementation of double precision floating point division and square root on FPGAs, in 2006 IEEE Aerospace Conference, 7 pp. IEEE (2006)

  49. Á. Vázquez, J.D. Bruguera, Iterative algorithm and architecture for exponential, logarithm, powering, and root extraction. IEEE Trans. Comp. 62(9), 1721–1731 (2012)

  50. A. Vázquez, F. de Dinechin, Efficient implementation of parallel BCD multiplication in LUT-6 FPGAs, in 2010 International Conference on Field-Programmable Technology, pp. 126–133. IEEE (2010)

  51. A. Vázquez, F. de Dinechin, Multi-operand decimal tree adders for FPGAs. Research Report (2010)

  52. M. Vázquez, L. Leiva, G. Sutter, Radix-10 decimal logarithm by direct selection for 6-input LUTs programmable devices. Microprocess. Microsyst. 64, 143–158 (2019)

  53. M. Vázquez, E. Todorovich, FPGA-specific decimal sign-magnitude addition and subtraction. Int. J. Electron. 103(7), 1166–1185 (2016)

  54. M. Vázquez, M. Tosini, Design and implementation of decimal fixed-point square root in LUT-6 FPGAs, in 2014 IX Southern Conference on Programmable Logic (SPL), pp. 1–6. IEEE (2014)

  55. K. Vijeyakumar, V. Sumathy, P. Vasakipriya, A.D. Babu, FPGA implementation of low power high speed square root circuits, in 2012 IEEE International Conference on Computational Intelligence and Computing Research, pp. 1–5. IEEE (2012)

  56. L.K. Wang, M.J. Schulte, Decimal floating-point square root using Newton-Raphson iteration, in 2005 IEEE International Conference on Application-Specific Systems, Architecture Processors (ASAP’05), pp. 309–315. IEEE (2005)

  57. X. Wang, B.E. Nelson, Tradeoffs of designing floating-point division and square root on Virtex FPGAs, in 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2003), pp. 195–203. IEEE (2003)

  58. Xilinx: ISE design suite 14: release notes, installation and licensing (2013). www.xilinx.com

  59. Xilinx: XST user guide for Virtex-6, Spartan-6 and 7 series devices (2013). www.xilinx.com

  60. Xilinx: 7 series FPGAs configuration: user guides (2016). www.xilinx.com

  61. Xilinx: UltraScale architecture and product data sheet: overview (2018). www.xilinx.com

  62. Xilinx: Vivado design suite user guide: design flows overview (2018). www.xilinx.com

  63. Xilinx: Vivado design suite user guide: synthesis (2019). www.xilinx.com

  64. B. Yang, D. Wang, L. Liu, Complex division and square-root using CORDIC, in 2012 2nd International Conference on Consumer Electronics, Communications and Networks (CECNet), pp. 2464–2468. IEEE (2012)

Acknowledgements

This work was partially supported by research project funds provided by the Research Secretary of the Faculty of Engineering of FASTA University and SeCAT of UNICEN University.

The data that support the findings of this study are openly available on GitHub at https://github.com/LabSET-UNICEN/radix-10_sqrt (reference 29).

Corresponding author

Correspondence to Lucas Leiva.

Cite this article

Vázquez, M., Tosini, M. & Leiva, L. Radix-10 Restoring Square Root for 6-input LUTs Programmable Devices. Circuits Syst Signal Process 40, 2335–2360 (2021). https://doi.org/10.1007/s00034-020-01571-y

  • DOI: https://doi.org/10.1007/s00034-020-01571-y
