ABSTRACT
We introduce a novel approach to accelerating functional simulation. The key attributes of our approach are high performance, low cost, scalability, and low turnaround time (TAT). We achieve speedups of 25x to 2000x over zero-delay event-driven simulation and 75x to 1000x over cycle-based simulation on benchmark and industrial circuits, while maintaining the cost, scalability, and TAT advantages of simulation. Owing to these attributes, we believe that such an approach has potential for very wide deployment as a replacement for, or an enhancement to, existing simulators. Our technology relies on a VLIW-like virtual simulation processor (SimPLE) mapped to a single FPGA on an off-the-shelf PCI board. Three factors are primarily responsible for the speed: (i) parallelism in the processor architecture, (ii) the high pin count on the FPGA, which enables large instruction bandwidth, and (iii) a high-speed (124 MHz on Xilinx Virtex-II) single-FPGA implementation of the processor with regularity-driven, efficient place and route. The processor is paired with the very fast SimPLE compiler, which achieves compilation rates of 4 million gates/hour. To simulate the netlist, the compiled instructions are streamed through the FPGA along with the simulation vectors. This architecture plugs naturally into any existing HDL simulation environment. We have a working prototype based on a commercially available PCI-based FPGA board.
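The compile-then-stream flow described in the abstract can be illustrated with a small software sketch. This is only an assumption-laden toy, not the SimPLE instruction set or compiler: the `Instr` format, the `compile_netlist` and `simulate` functions, and the full-adder netlist are all hypothetical names chosen for illustration. The sketch orders gates level by level so that each instruction reads only values that have already been computed, then replays the same instruction stream once per simulation vector. Instructions within one level are mutually independent, which is the kind of parallelism a VLIW-like processor could exploit; the sketch itself evaluates them sequentially.

```python
from collections import namedtuple

# Hypothetical instruction format: one two-input gate evaluation per instruction.
Instr = namedtuple("Instr", "op dst src_a src_b")

OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}


def compile_netlist(gates, primary_inputs):
    """Levelize a combinational netlist into a linear instruction stream.

    gates maps each output signal to (op, input_a, input_b). Every emitted
    instruction reads only primary inputs or earlier results.
    """
    ready = set(primary_inputs)
    pending = dict(gates)
    program = []
    while pending:
        # All gates whose inputs are available form one level.
        level = sorted(out for out, (_, a, b) in pending.items()
                       if a in ready and b in ready)
        if not level:
            raise ValueError("combinational loop or undriven signal")
        for out in level:
            op, a, b = pending.pop(out)
            program.append(Instr(op, out, a, b))
        ready.update(level)
    return program


def simulate(program, vectors):
    """Stream the compiled program once per input vector (zero-delay)."""
    results = []
    for vec in vectors:
        values = dict(vec)  # signal name -> 0 or 1
        for ins in program:
            values[ins.dst] = OPS[ins.op](values[ins.src_a], values[ins.src_b])
        results.append(values)
    return results


# Illustrative netlist (an assumption for this sketch): a one-bit full adder.
FULL_ADDER = {
    "t1": ("XOR", "a", "b"),
    "sum": ("XOR", "t1", "cin"),
    "t2": ("AND", "a", "b"),
    "t3": ("AND", "t1", "cin"),
    "cout": ("OR", "t2", "t3"),
}

program = compile_netlist(FULL_ADDER, ["a", "b", "cin"])
vectors = [{"a": a, "b": b, "cin": c}
           for a in (0, 1) for b in (0, 1) for c in (0, 1)]
results = simulate(program, vectors)
```

The netlist is compiled once and the resulting program is reused for every vector, so the per-vector work is a single linear pass over the instruction stream, with no event queue.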