DOI: 10.1145/2145694.2145726
research-article

A scalable approach for automated precision analysis

Published: 22 February 2012

Abstract

The freedom to choose numerical precision is one of the key factors that can be exploited throughout the datapath of an FPGA accelerator, providing the ability to trade the accuracy of the final computational result against silicon area, power, operating frequency, and latency. However, in order to tune the precision used throughout hardware accelerators automatically, a tool is required to verify that the hardware will meet an error or range specification for a given precision. Existing tools for this task typically either produce loose bounds or require a large execution time when applied to large-scale algorithms. In this work, we propose an approach that can both scale to larger examples and obtain tighter bounds, within a smaller execution time, than the existing methods. The approach also lets the user trade the quality of the bounds against the execution time of the procedure, making it suitable for use within a word-length optimization framework for both small and large-scale algorithms.
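As a rough illustration of what a "lack of tightness of bounds" can look like, the sketch below applies plain interval arithmetic, one classical range-analysis technique, to the expression x(1 - x). The abstract does not say which existing tools it has in mind, so the choice of interval arithmetic, the `Interval` class, and the example expression are all assumptions made here for illustration. Outward rounding of the interval endpoints is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi]; a hypothetical helper, not taken from the paper."""
    lo: float
    hi: float

    def __sub__(self, other: "Interval") -> "Interval":
        # Smallest result: lowest minuend minus highest subtrahend, and vice versa.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other: "Interval") -> "Interval":
        # The extremes of a product lie at one of the four endpoint products.
        products = (
            self.lo * other.lo,
            self.lo * other.hi,
            self.hi * other.lo,
            self.hi * other.hi,
        )
        return Interval(min(products), max(products))


def interval_bound(x: Interval) -> Interval:
    """Bound x * (1 - x), treating the two occurrences of x as independent."""
    one = Interval(1.0, 1.0)
    return x * (one - x)


def sampled_range(x: Interval, samples: int = 1001) -> Interval:
    """Estimate the range by sampling; an estimate only, not a guaranteed bound."""
    values = []
    for i in range(samples):
        point = x.lo + (x.hi - x.lo) * i / (samples - 1)
        values.append(point * (1.0 - point))
    return Interval(min(values), max(values))


domain = Interval(0.0, 1.0)
bound = interval_bound(domain)
observed = sampled_range(domain)
print("interval bound:", bound)
print("sampled range: ", observed)
```

The interval bound is safe but four times wider than the sampled range, because interval arithmetic forgets that both occurrences of x refer to the same value. A word-length optimizer driven by such a bound would allocate more bits than the computation needs.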
We demonstrate our approach on instances of iterative algorithms for solving systems of linear equations. Because our approach, unlike the existing methods, can track how the relative error decreases with increasing precision, we can use it to create smaller hardware with guaranteed numerical properties. This saves 25% of the area in comparison to optimizing the precision using competing analytical techniques, whilst requiring a smaller execution time than these methods, and saves almost 80% of the area in comparison to adopting IEEE double precision arithmetic.
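The relationship between precision and relative error can be observed empirically with a small simulation. This is a minimal sketch and not the paper's method: the paper derives guaranteed bounds analytically, whereas the code below only measures the error of one run. The Jacobi iteration, the 3x3 test system, and the `round_to_bits` helper are assumptions chosen for illustration; the abstract does not name the iterative algorithms it evaluates.

```python
import math


def round_to_bits(value: float, bits: int) -> float:
    """Round value to `bits` fractional mantissa bits (hypothetical helper)."""
    if value == 0.0:
        return 0.0
    # frexp gives value = mantissa * 2**exponent with 0.5 <= |mantissa| < 1.
    mantissa, exponent = math.frexp(value)
    scale = 2.0 ** bits
    return math.ldexp(round(mantissa * scale) / scale, exponent)


def jacobi_relative_error(bits: int, iterations: int = 60) -> float:
    """Run Jacobi with every operation rounded; return the relative error."""
    # Diagonally dominant system, so the Jacobi iteration converges.
    a = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
    b = [1.0, 2.0, 3.0]
    # Exact solution of a x = b, evaluated in double precision as the reference.
    exact = [5.0 / 28.0, 2.0 / 7.0, 19.0 / 28.0]

    x = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        new_x = []
        for i in range(3):
            acc = b[i]
            for j in range(3):
                if j != i:
                    product = round_to_bits(a[i][j] * x[j], bits)
                    acc = round_to_bits(acc - product, bits)
            new_x.append(round_to_bits(acc / a[i][i], bits))
        x = new_x

    worst = max(abs(xi - ei) for xi, ei in zip(x, exact))
    return worst / max(abs(ei) for ei in exact)


errors = {bits: jacobi_relative_error(bits) for bits in (8, 16, 24, 32)}
for bits, error in errors.items():
    print(f"{bits:2d} mantissa bits: relative error {error:.3e}")
```

A simulation like this shows the trend for one input only. It gives no guarantee for other inputs, which is why a bounding tool of the kind described in the abstract is needed before hardware can be sized with confidence.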

    Published In

    FPGA '12: Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
    February 2012
    352 pages
    ISBN: 9781450311557
    DOI: 10.1145/2145694

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. precision analysis
    2. range analysis
    3. word-length optimisation

    Acceptance Rates

    FPGA '12 Paper Acceptance Rate 20 of 87 submissions, 23%;
    Overall Acceptance Rate 125 of 627 submissions, 20%
