ABSTRACT
This paper describes OneChip, a third-generation reconfigurable processor architecture that integrates a Reconfigurable Functional Unit (RFU) into the pipeline of a superscalar Reduced Instruction Set Computer (RISC) processor. The architecture supports dynamic scheduling and dynamic reconfiguration, as well as configuration pre-loading and Least Recently Used (LRU) configuration management.
To evaluate the performance of the OneChip architecture, several off-the-shelf software applications were compiled and executed on Sim-OneChip, an architecture simulator for OneChip that includes a software environment for programming the system. The architecture is compared against a similar one that has neither dynamic scheduling nor an RFU. OneChip achieves speedups ranging from 2.16 to 32 across the applications and data sizes used. The results show that, on average, dynamic scheduling contributes the most to performance, and that the RFU's benefit is greatest when most of the execution takes place in the RFU.
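As a rough illustration of what LRU configuration management with pre-loading can look like, the following Python sketch models an RFU that holds a fixed number of configurations. It is not taken from OneChip or Sim-OneChip: the class `ConfigStore`, its methods, and the configuration names are hypothetical, and the policy of treating a pre-load as a use is an assumption made for this example.

```python
from collections import OrderedDict


class ConfigStore:
    """Hypothetical model of an RFU with a fixed number of configuration slots."""

    def __init__(self, slots):
        self.slots = slots
        # Keys are configuration names, ordered from least to most recently used.
        self.loaded = OrderedDict()
        self.loads = 0  # number of configurations actually loaded

    def _touch(self, cfg):
        """Mark cfg as most recently used, loading it (and evicting) if needed."""
        if cfg in self.loaded:
            self.loaded.move_to_end(cfg)
            return
        if len(self.loaded) >= self.slots:
            # Evict the least recently used configuration.
            self.loaded.popitem(last=False)
        self.loaded[cfg] = True
        self.loads += 1

    def preload(self, cfg):
        """Load cfg ahead of use (assumed here to count as a use)."""
        self._touch(cfg)

    def execute(self, cfg):
        """Use cfg; return True if it was already resident, False on a miss."""
        hit = cfg in self.loaded
        self._touch(cfg)
        return hit


store = ConfigStore(slots=2)
store.preload("fir")
print(store.execute("fir"))  # resident thanks to the pre-load
print(store.execute("dct"))  # miss, loaded into the free slot
print(store.execute("fir"))  # hit, "fir" becomes most recently used
print(store.execute("aes"))  # miss, evicts "dct"
print(list(store.loaded))
```

In this sketch a pre-load turns the first use of a configuration into a hit, and eviction always removes the configuration that has gone unused the longest.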