DOI: 10.1145/1404371.1404392
research-article

Implementation of a double-precision multiplier accumulator with exception treatment to a dense matrix multiplier module in FPGA

Published: 01 September 2008

ABSTRACT

Recently, supercomputer manufacturers have begun using FPGAs to accelerate scientific applications [16][17]. Traditionally, FPGAs were used only in non-scientific applications, for three main reasons: floating-point computation is complex; FPGA logic cells were insufficient to implement scientific cores; and the complexity of those cores prevented them from operating at high frequencies.

Today, the growing availability of specialized blocks for complex operations, such as adder and multiplier blocks implemented directly in the FPGA, together with the increase in internal RAM blocks (BRAMs), has made possible high-performance systems that use FPGAs as processing elements for scientific computation [2].

These devices are used as co-processors that execute compute-intensive tasks. These architectures emphasize exploiting the parallelism present in scientific computation and reusing data.

In most of these applications, scientific computation relies on operations over large, dense floating-point matrices, which are normally computed by multiplier-accumulators (MACs).
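To illustrate how a dense matrix product reduces to multiply-accumulate steps, the following is a minimal software sketch, not the hardware design described in this work. The names `mac` and `matmul_mac`, the `tile` parameter, and the use of small copied tiles as a stand-in for on-chip buffers are all assumptions made for illustration.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: returns acc + a * b."""
    return acc + a * b


def matmul_mac(A, B, tile=2):
    """Multiply dense matrices A (n x k) and B (k x m) using only MAC steps.

    The operands are processed tile by tile. Each tile is copied into a
    small local buffer (a hypothetical stand-in for on-chip memory) and
    every element of it is reused several times before the next tile loads.
    """
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Load one tile of A and one tile of B into local buffers.
                a_tile = [row[p0:p0 + tile] for row in A[i0:i0 + tile]]
                b_tile = [row[j0:j0 + tile] for row in B[p0:p0 + tile]]
                # Each buffered value feeds several MAC steps (data reuse).
                for i, a_row in enumerate(a_tile):
                    for j in range(len(b_tile[0])):
                        acc = C[i0 + i][j0 + j]
                        for p, a_val in enumerate(a_row):
                            acc = mac(acc, a_val, b_tile[p][j])
                        C[i0 + i][j0 + j] = acc
    return C
```

For example, `matmul_mac([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])` accumulates each of the four output elements through two MAC steps. In hardware, the independent accumulations are what allow several MAC units to work in parallel.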

In this work, we describe the architecture of a double-precision floating-point multiplier-accumulator (MAC) that conforms to the IEEE-754 standard, and we propose the architecture of a matrix multiplier that uses instances of the developed MAC and exploits data reuse through the BRAMs (blocks of RAM internal to the FPGA) of a Xilinx Virtex 4 LX200 FPGA. The synthesis results showed that the implemented MAC could reach a performance of 4 GFLOPS.
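The abstract does not detail the exception treatment, so the following is only a sketch of the IEEE-754 double-precision encoding that any such treatment has to inspect: 1 sign bit, 11 exponent bits, and 52 fraction bits, with the all-ones and all-zeros exponents reserved for special values. The function names `decode_double` and `classify` are hypothetical, and the classification shown follows the standard's encoding rather than the authors' circuit.

```python
import struct


def decode_double(x):
    """Split a Python float into IEEE-754 binary64 fields.

    Returns (sign, biased_exponent, fraction) as integers.
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF       # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)     # 52-bit fraction field
    return sign, exponent, fraction


def classify(x):
    """Name the IEEE-754 class of x based only on its bit fields."""
    _, exponent, fraction = decode_double(x)
    if exponent == 0x7FF:                 # all ones: infinity or NaN
        return "infinity" if fraction == 0 else "nan"
    if exponent == 0:                     # all zeros: zero or subnormal
        return "zero" if fraction == 0 else "subnormal"
    return "normal"
```

A MAC that handles exceptions would typically need to distinguish these classes on its inputs and results, for instance to propagate a NaN or to signal overflow to infinity.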

References

  1. Ling Zhuo, Viktor K. Prasanna. Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on Reconfigurable Computing Systems. IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 18, No. 4, pp. 433--448, April 2007.
  2. K. D. Underwood and K. S. Hemmert. Closing the Gap: CPU and FPGA Trends in Sustainable Floating-Point BLAS Performance. In Proc. of 2004 IEEE Symposium on Field Programmable Custom Computing Machines, California, USA, April 2004.
  3. A. Chtchelkanova, J. Gunnels, G. Morrow, J. Overfelt and K. D. Underwood and K. S. Hemmert. Closing the Gap: CPU and FPGA Trends in Sustainable Floating-Point BLAS Performance. In Proc. of 2004 IEEE Symposium on Field-Programmable Custom Computing Machines, California, USA, April 2004.
  4. L. Zhuo and V. K. Prasanna. Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on FPGAs. In Proc. of the 18th International Parallel & Distributed Processing Symposium, New Mexico, USA, April 2004.
  5. L. Zhuo and V. K. Prasanna. Scalable Hybrid Designs for Linear Algebra on Reconfigurable Computing Systems, submitted to IEEE Transactions on Computers.
  6. R. Scrofano and V. K. Prasanna. Computing Lennard-Jones Potentials and Forces with Reconfigurable Hardware. In Proc. Int'l Conf. Eng. of Reconfigurable Systems and Algorithms (ERSA'04), pages 284--290, June 2004.
  7. K. D. Underwood and K. S. Hemmert. Closing the Gap: CPU and FPGA Trends in Sustainable Floating-Point BLAS Performance. In Proc. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM'04), April 2004.
  8. L. Zhuo and V. K. Prasanna. Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on FPGAs. In Proc. 18th Int'l Parallel & Distributed Processing Symp. (IPDPS'04), New Mexico, USA, April 2004.
  9. IEEE Standard for Binary Floating-Point Arithmetic, 1985.
  10. C. Babb, J. Blank, I. Castellanos, J. Moskal. Floating Point Multiplier Final Project ECE 587.
  11. Per Karlström, Andreas Ehliar, Dake Liu. High Performance, Low Latency FPGA based Floating Point Adder and Multiplier Units in a Virtex 4.
  12. E. Mark. Free Floating-Point Madness.
  13. Yong Dou, S. Vassiliadis, G. K. Kuzmanov, G. N. Gaydadjiev. 64-bit Floating-Point FPGA Matrix Multiplication.
  14. Mingw 5.1.3. http://www.mingw.org/
  15. GSL - GNU Scientific Library.
  16. Cray Inc. Cray XD1 FPGA Development. http://www.cray.com/
  17. SGI Inc. http://www.sgi.com/
  18. ModelSim. http://www.model.com/

          Published in

          SBCCI '08: Proceedings of the 21st annual symposium on Integrated circuits and system design
          September 2008
          256 pages
          ISBN:9781605582313
          DOI:10.1145/1404371

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 133 of 347 submissions, 38%
