An FPGA-Based Parallel Accelerator for Matrix Multiplications in the Newton-Raphson Method

  • Conference paper
Embedded and Ubiquitous Computing – EUC 2005 (EUC 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3824)

Abstract

Power flow analysis plays an important role in power grid configuration, operation management, and contingency analysis. The Newton-Raphson (NR) iterative method is often used to solve power flow problems. However, it involves computationally expensive matrix multiplications (MMs). In this paper we propose a Hierarchical-SIMD (H-SIMD) machine, based on field-programmable gate arrays (FPGAs) and codesigned with a Hierarchical Instruction Set Architecture (HISA), to speed up MM within each NR iteration. HISA comprises medium-grain and coarse-grain instructions. The H-SIMD machine also facilitates better mapping of MM onto recent multimillion-gate FPGAs. At each level, every HISA instruction is classified as either a communication or a computation instruction. The former are executed by a controller, while the latter are issued to lower levels in the hierarchy. Additionally, by using a memory switching scheme and the high-level HISA set to partition applications, the host-FPGA communication overheads can be hidden. Our test results show sustained high performance.
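
The abstract does not specify the HISA instruction formats or the block sizes used, so the following is only a software analogy of the hierarchical idea: a matrix product is partitioned into coarse-grain tiles, and each coarse tile is further split into medium-grain tiles that do the actual multiply-accumulate work. The function names (`tiles`, `multiply_tile`, `hierarchical_matmul`) and the tile sizes are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: two-level tiled matrix multiplication in plain
# Python. Names and tile sizes are assumptions, not the paper's H-SIMD design.

def tiles(start, stop, step):
    """Yield (lo, hi) index ranges covering [start, stop) in chunks of step."""
    for lo in range(start, stop, step):
        yield lo, min(lo + step, stop)


def multiply_tile(A, B, C, rows, cols, inner):
    """Medium-grain step: accumulate one tile product into C in place."""
    for i in range(*rows):
        for k in range(*inner):
            a_ik = A[i][k]
            for j in range(*cols):
                C[i][j] += a_ik * B[k][j]


def hierarchical_matmul(A, B, coarse=4, medium=2):
    """Compute A @ B by splitting the work into coarse, then medium tiles."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    # Coarse level: partition the whole product into large tiles.
    for row_blk in tiles(0, n, coarse):
        for col_blk in tiles(0, p, coarse):
            for in_blk in tiles(0, m, coarse):
                # Medium level: split each coarse tile into smaller tiles.
                for r in tiles(*row_blk, medium):
                    for c in tiles(*col_blk, medium):
                        for k in tiles(*in_blk, medium):
                            multiply_tile(A, B, C, r, c, k)
    return C
```

Calling `hierarchical_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` yields the ordinary matrix product. The tile bounds are clipped at each level, so tile sizes need not divide the matrix dimensions or each other.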

This work was supported in part by the US Department of Energy under grant DE-FG02-03CH11171.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, X., Ziavras, S.G., Chang, TG. (2005). An FPGA-Based Parallel Accelerator for Matrix Multiplications in the Newton-Raphson Method. In: Yang, L.T., Amamiya, M., Liu, Z., Guo, M., Rammig, F.J. (eds) Embedded and Ubiquitous Computing – EUC 2005. EUC 2005. Lecture Notes in Computer Science, vol 3824. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596356_47

  • DOI: https://doi.org/10.1007/11596356_47

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30807-2

  • Online ISBN: 978-3-540-32295-5

  • eBook Packages: Computer Science, Computer Science (R0)
