Abstract
Reconfigurable computing (RC) is rapidly emerging as a promising technology for the future of high-performance and embedded computing, enabling systems with the computational density and power of custom-logic hardware and the versatility of software-driven hardware in an optimal mix. Novel methods for rapid virtual prototyping, performance prediction, and evaluation are of critical importance in the engineering of complex reconfigurable systems and applications. These techniques can yield insightful tradeoff analyses while saving valuable time and resources for researchers and engineers alike. The research described herein provides a methodology for mapping arbitrary applications to targeted reconfigurable platforms in a simulation environment called RCSE. By splitting the process into two domains, the application and simulation domains, characterization of each element can occur independently and in parallel, leading to fast and accurate performance prediction results for large and complex systems. This article presents the design of a novel framework for system-level simulative performance prediction of RC systems and applications. The article also presents a set of case studies analyzing two applications, Hyperspectral Imaging (HSI) and Molecular Dynamics (MD), across three disparate RC platforms within the simulation framework. The validation results using each of these applications and systems show that our framework can quickly obtain performance prediction results with reasonable accuracy on a variety of platforms. Finally, a set of simulative case studies is presented to illustrate the various capabilities of the framework to quickly obtain a wide range of performance prediction results and power consumption estimates.
Index Terms
- A Simulation Framework for Rapid Analysis of Reconfigurable Computing Systems