Abstract
The Cameron Project has developed a system for compiling code written in SA-C, a high-level language, to FPGA-based reconfigurable computing systems. To exploit the parallelism available on FPGAs, the SA-C compiler performs a large number of optimizations, such as full loop unrolling, loop fusion, and strip-mining. However, because FPGA area is limited, the compiler needs to know how each optimization affects the area a design occupies; this information is typically not available until after synthesis and place and route, which can take hours. In this article, we present a compile-time area estimation technique to guide SA-C compiler optimizations. We demonstrate the technique on a variety of benchmarks written in SA-C. Experimental results show that it predicts the area required for a design to within 2.5% of the actual area for small image processing operators and to within 5.0% for larger benchmarks. Estimation takes on the order of milliseconds, compared with minutes for the synthesis tool.
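As a rough illustration of how a compile-time area estimate could guide an optimization such as loop unrolling, the sketch below sums assumed per-operator LUT costs over a loop body and picks the largest unroll factor that fits an area budget. It is not the estimation technique of the article: the names `lut_cost`, `Node`, `estimate_area`, and `max_unroll`, the cost formulas, and the budget are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


def lut_cost(op: str, width: int) -> int:
    """Hypothetical LUT cost of one operator at a given bit width.

    These formulas are illustrative assumptions, not measured values.
    """
    if op == "add":
        return width            # assume one LUT per result bit
    if op == "mul":
        return width * width    # assume a simple array multiplier
    if op == "mux":
        return width            # assume one LUT per selected bit
    raise ValueError(f"unknown operator: {op}")


@dataclass(frozen=True)
class Node:
    """One operator in a (hypothetical) loop-body dataflow graph."""
    op: str
    width: int


def estimate_area(loop_body: list[Node], unroll: int) -> int:
    """Estimate LUTs for a loop body replicated `unroll` times."""
    return unroll * sum(lut_cost(n.op, n.width) for n in loop_body)


def max_unroll(loop_body: list[Node], budget_luts: int,
               candidates=(1, 2, 4, 8, 16)):
    """Largest candidate unroll factor whose estimate fits the budget, else None."""
    fitting = [u for u in candidates
               if estimate_area(loop_body, u) <= budget_luts]
    return max(fitting) if fitting else None


# Example loop body: one 8-bit multiply, one 16-bit add, one 16-bit mux.
body = [Node("mul", 8), Node("add", 16), Node("mux", 16)]
```

Because each estimate is a simple sum, a compiler could evaluate many candidate transformations quickly and reserve synthesis for the final design; a real estimator would also need to account for effects this sketch ignores, such as operator sharing and routing.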