Power Performance Profiling of 3-D Stencil Computation on an FPGA Accelerator for Efficient Pipeline Optimization

Published: 22 April 2016

Abstract

This paper discusses power-performance optimization for 3-D stencil computing on a stream-oriented FPGA accelerator with high-level synthesis. Using a heat conduction simulation and an FDTD electromagnetic field simulation as benchmark applications, it presents power-performance profiling results that focus on the effect of high-level pipeline parameters. The results show that optimal power efficiency can essentially be achieved by optimizing execution performance. The relationship between power efficiency and clock frequency is also discussed.
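As a rough illustration of the kind of kernel and metric the abstract refers to, the sketch below shows one explicit 7-point stencil update for 3-D heat conduction and a performance-per-watt calculation. It is a plain software sketch, not the paper's FPGA pipeline or its high-level synthesis code; the function names `heat_step` and `power_efficiency`, the coefficient `alpha`, and the fixed-boundary handling are all assumptions made for illustration.

```python
def heat_step(grid, alpha=0.1):
    """One explicit 7-point stencil update on a 3-D grid (nested lists).

    Assumption: boundary cells are held fixed, and alpha is the combined
    diffusion coefficient (it should stay at or below 1/6 for this
    explicit scheme to remain stable).
    """
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    # Copy the grid so boundary values carry over unchanged.
    new = [[row[:] for row in plane] for plane in grid]
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                c = grid[z][y][x]
                neighbors = (grid[z - 1][y][x] + grid[z + 1][y][x] +
                             grid[z][y - 1][x] + grid[z][y + 1][x] +
                             grid[z][y][x - 1] + grid[z][y][x + 1])
                new[z][y][x] = c + alpha * (neighbors - 6.0 * c)
    return new


def power_efficiency(flops, seconds, watts):
    """Sustained performance per watt, in FLOP/s per W."""
    return flops / seconds / watts
```

With a metric of this form, a shorter execution time at roughly constant power directly raises efficiency, which matches the abstract's observation that optimizing performance largely optimizes power efficiency.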

  • Published in

    ACM SIGARCH Computer Architecture News, Volume 43, Issue 4
    HEART '15
    September 2015
    98 pages
    ISSN: 0163-5964
    DOI: 10.1145/2927964

    Copyright © 2016 Authors

    Publisher

    Association for Computing Machinery

    New York, NY, United States
