Abstract
On FPGAs, resources saved in individual operators translate directly into additional parallelism. For floating-point (FP) operators, such savings can be achieved by reducing the mantissa bit width. The work presented here pursues two objectives. First, the maximum number of operands of a parallel dot product architecture is explored experimentally on an FPGA for different custom-precision FP number formats. Given the resources of this FPGA, it is shown that, based on non-pipelined basic FP operators, a dot product for input vectors of size 21, 57 and 123 can be implemented for double, single and half precision, respectively. This corresponds to peak performances of 1, 3.2 and 9.9 GFlop/s, respectively. Second, it is shown that the maximum dot product peak performance as a function of the precision \(p\) used can be modeled by a function of the form \(P(p)=c_1+c_2 \cdot p^{c_3}\) for a given type of FPGA, library and synthesis settings. Fitting experimental data to this model reveals similarities as well as differences among generations of devices.
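The relation between vector size and peak performance, and the shape of the model \(P(p)\), can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: a dot product of two length-n vectors is counted as n multiplications plus n - 1 additions, and the fully parallel architecture is assumed to deliver one complete dot product per evaluation. The function names and the coefficient values in the usage lines are hypothetical placeholders, not fitted values from the paper.

```python
def dot_product_flops(n: int) -> int:
    """Floating-point operations in one dot product of two length-n vectors.

    Assumption: n multiplications plus n - 1 additions are counted.
    """
    if n < 1:
        raise ValueError("vector size must be at least 1")
    return 2 * n - 1


def implied_rate_mhz(peak_gflops: float, n: int) -> float:
    """Millions of dot products per second implied by a peak performance.

    Assumption: peak performance = flops per dot product * dot products per second.
    """
    return peak_gflops * 1e9 / dot_product_flops(n) / 1e6


def peak_model(p: float, c1: float, c2: float, c3: float) -> float:
    """Evaluate the model P(p) = c1 + c2 * p**c3 from the abstract."""
    return c1 + c2 * p ** c3


if __name__ == "__main__":
    # Vector sizes and peak performances as reported in the abstract.
    reported = {"double": (21, 1.0), "single": (57, 3.2), "half": (123, 9.9)}
    for name, (n, gflops) in reported.items():
        rate = implied_rate_mhz(gflops, n)
        print(f"{name}: {dot_product_flops(n)} flops, {rate:.1f} M dot products/s")

    # Placeholder coefficients, chosen only to show how the model is evaluated.
    print(peak_model(24.0, c1=0.5, c2=100.0, c3=-1.0))
```

Under these assumptions, the reported figures would correspond to roughly 24, 28 and 40 million dot products per second for double, single and half precision. Actual coefficients \(c_1\), \(c_2\) and \(c_3\) would have to be obtained by fitting measured data for a specific device, library and set of synthesis settings.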
This work was supported by the Austrian Research Promotion Agency (FFG) under contract 819469 (MixSVM) and by the Austrian Science Fund (FWF) under contract S10608-N13 (NFN SISE).
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mücke, M., Lesser, B., Gansterer, W.N. (2011). Peak Performance Model for a Custom Precision Floating-Point Dot Product on FPGAs. In: Guarracino, M.R., et al. Euro-Par 2010 Parallel Processing Workshops. Euro-Par 2010. Lecture Notes in Computer Science, vol 6586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21878-1_49
DOI: https://doi.org/10.1007/978-3-642-21878-1_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21877-4
Online ISBN: 978-3-642-21878-1
eBook Packages: Computer Science, Computer Science (R0)