Improving Floating-Point Performance in Less Area: Fractured Floating Point Units (FFPUs)

Published in Journal of Signal Processing Systems 67, 31–46 (2012)

Abstract

Embedded systems designers often use fixed-point instead of floating-point due to the performance and area overhead of floating-point units. If the range of floating-point representation is required, the system may use a software-based floating-point library on an integer-only processor to save area—at the cost of much lower performance. Instead, we propose a Fractured Floating Point Unit (FFPU)—a hybrid solution that uses a set of custom hardware instructions to accelerate software-based floating-point emulation. An FFPU is intended as a compromise between software libraries and full FPUs in terms of both area and performance. We present four potential 32-bit FFPU designs for a Nios II soft processor. We compare their performance and area to the baseline Nios II, as well as a Nios II with a complete FPU. We show that an FFPU can improve various floating-point operations, including improving addition and subtraction performance by 24 to 52 percent over the baseline. This performance comes at a resource cost of only an 11 to 29 percent ALM increase, and no increase in DSP blocks.
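As a rough illustration of the bookkeeping that software-based emulation performs on an integer-only processor, the sketch below splits a 32-bit IEEE 754 value into its fields using only integer operations, then reassembles it. The function names `unpack_float32` and `pack_float32` are hypothetical and are not taken from the paper or from any library; the paper's own custom instructions may divide this work differently.

```python
import struct


def unpack_float32(x):
    """Split a single-precision value into (sign, biased exponent, fraction)."""
    # Reinterpret the float's 32 bits as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF      # 23 bits, hidden leading 1 not stored
    return sign, exponent, fraction


def pack_float32(sign, exponent, fraction):
    """Reassemble the three fields into a single-precision value."""
    bits = (sign << 31) | (exponent << 23) | fraction
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Each shift-and-mask step here costs one or more integer instructions in a software library, which is the kind of overhead a small amount of dedicated hardware could plausibly reduce.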

Notes

  1. The Altera FPU does not include an option for hardware-based square root support.

  2. SoftFloat left-shifts intermediate significands to preserve round, guard, and sticky bits for values stored in integer registers. See the extractSignificand instruction for details.
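The idea in note 2 can be sketched as follows: a significand held in an integer register is shifted left so that its low-order bits can carry rounding information through later steps. This is a minimal sketch under stated assumptions, not SoftFloat's code or the paper's extractSignificand instruction. The three-bit extension (`EXTRA_BITS`) and all function names are chosen for illustration only.

```python
EXTRA_BITS = 3  # assumed for illustration: guard, round, and sticky bits


def extend_significand(fraction, exponent):
    """Restore the hidden bit (for normal numbers) and shift left by EXTRA_BITS."""
    hidden = (1 << 23) if exponent != 0 else 0
    return (hidden | fraction) << EXTRA_BITS


def shift_right_sticky(sig, count):
    """Shift right, ORing any 1 bits that fall off into the lowest (sticky) bit."""
    if count == 0:
        return sig
    lost = sig & ((1 << count) - 1)
    return (sig >> count) | (1 if lost else 0)


def round_nearest_even(sig):
    """Drop the EXTRA_BITS low bits, rounding to nearest with ties to even.

    The result may carry out to 1 << 24; a caller would then renormalize.
    """
    half = 1 << (EXTRA_BITS - 1)
    low = sig & ((1 << EXTRA_BITS) - 1)
    result = sig >> EXTRA_BITS
    if low > half or (low == half and (result & 1)):
        result += 1
    return result
```

Because the sticky bit records whether any nonzero bits were discarded during alignment, the final rounding step can tell an exact tie from a value slightly above it without keeping the full-width intermediate result.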

References

  1. Hauser, J. R. (2002). SoftFloat Release 2b. Available online at http://www.jhauser.us/arithmetic/SoftFloat.html.

  2. Altera Corporation. (2009). Nios II processor reference handbook. San Jose: Altera Corporation.

  3. IEEE (2008). IEEE Standard for Floating-Point Arithmetic, IEEE Standard 754-2008.

  4. Kadlec, J., Bartosinski, R., Danek, M. (2007). Accelerating microblaze floating point operations. International Conference on Field-Programmable Logic and Applications, 621–624.

  5. Beauchamp, M. J., Hauck, S., Underwood, K. D., Hemmert, K. S. (2006). Embedded floating-point units in FPGAs. ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 12–20.

  6. Hemmert, K. S., Underwood, K. D. (2006). Open source high performance floating-point modules. IEEE Symposium on Field-Programmable Custom Computing Machines, 349–350.

  7. Sheldon, D., Kumar, R., Vahid, F., Tullsen, D., Lysecky, R. (2006). Conjoining soft-core FPGA processors. IEEE/ACM International Conference on Computer-Aided Design, 694–701.

  8. Belanovic, P., Leeser, M. (2002). A library of parameterized floating-point modules and their use. International Conference on Field-Programmable Logic and Applications, 657–666.

  9. Brunelli, C., Campi, F., Kylliainen, J., Nurmi, J. (2004). A reconfigurable FPU as IP component for SoCs. International Symposium on System-on-Chip, 103–106.

  10. Karuri, K., Leupers, R., Kedia, M., Ascheid, G., & Meyr, H. (2006). Design and implementation of a modular and portable IEEE 754 compliant floating-point unit. Design, Automation & Test in Europe, 2, 1–6.

  11. Chong, Y. J., Parameswaran, S. (2008). Rapid application specific floating-point unit generation with bit-alignment. Design Automation Conference, 62–67.

  12. Dally, W. J. (1989). Micro-optimization of floating-point operations. ACM SIGARCH Computer Architecture News, 17(2), 283–289.

  13. Chouliaras, V. A., Nunez-Yanez, J. L. (2007). An IEEE 754 floating point engine designed with an electronic system level methodology. Norchip, 1–4.

  14. ARM Limited (2005). ARM1136JF-S and ARM1136J-S Technical Reference Manual, Revision: r1p1, ARM Limited.

  15. IBM (2009). Power ISA Version 2.06, International Business Machines.

  16. Intel (2006). Intel Itanium Architecture Software Developer’s Manual, January 2006.

  17. Jeannerod, C.-P., Raina, S. K., Tisserand, A. (2005). High-radix floating-point division algorithms for embedded VLIW integer processors. Proceedings of 17th IMACS World Congress (Scientific Computation, Applied Mathematics and Simulation), July 11–15, Paris, France.

  18. Bertin, C., Brisebarre, N., Dupont de Dinechin, B., Jeannerod, C.-P., Monat, C., Muller, J.-M., et al. (2004). A floating-point library for integer processors. SPIE 49th Annual Meeting, proceedings of SPIE vol. 5559 (Advanced Signal Processing Algorithms, Architectures, and Implementations XIV), August 2–6, Denver, USA.

  19. Nickolls, J. R. (1990). The design of the MasPar MP-1: a cost effective massively parallel computer. IEEE Computer Society International Conference (Compcon), 25–28. March 1990.

  20. Altera Corporation. (2008). Nios II custom instruction user guide. San Jose: Altera Corporation.

  21. Altera Corporation. (2006). Tutorial: Using Nios II custom floating-point custom instructions. San Jose: Altera Corporation.

  22. Altera Corporation. (2008). Application Note 391: Profiling Nios II Systems, Version 1.3. San Jose: Altera Corporation.

  23. Rupnow, K., Rodrigues, A., Underwood, K., Compton, K. (2006). Scientific applications vs. SPEC-FP: A comparison of program behavior. International Conference on Supercomputing, 66–74.

Acknowledgments

We would like to thank Altera Corporation and Terasic Technologies for the speedy donation of the DE3 board used in this study, Guy Lemieux for suggesting this type of project, and André DeHon for some useful pointers to related work.

Author information

Corresponding author

Correspondence to Katherine Compton.

About this article

Cite this article

Hockert, N., Compton, K. Improving Floating-Point Performance in Less Area: Fractured Floating Point Units (FFPUs). J Sign Process Syst 67, 31–46 (2012). https://doi.org/10.1007/s11265-010-0561-y

  • DOI: https://doi.org/10.1007/s11265-010-0561-y