Abstract
Modern application-specific instruction-set processors (ASIPs) face the daunting task of delivering high performance for a wide range of applications. To enhance performance, architectural features such as pipelining and VLIW are often employed in ASIPs, leading to high design complexity. Integrated ASIP design environments, such as template-based and language-driven approaches, provide an answer to this growing design complexity. At the same time, increasing hardware design costs have motivated processor designers to build a high degree of flexibility into the processor. Flexibility, in its most effective form, can be introduced to the ASIP by coupling a reconfigurable unit to the base processor. Because of the obvious benefits of this coupling, several reconfigurable ASIPs (rASIPs) have been designed over the years. This design paradigm gained momentum with the advent of coarse-grained FPGAs, in which the lack of domain-specific performance common in general-purpose FPGAs is largely overcome by choosing application-dependent basic functional units. These rASIP designs lack a generic flow from high-level specification, which results in intuition-driven design decisions and hard-to-retarget processor design tools. Although partial, template-based approaches to rASIP design exist, a clear design methodology, especially for prefabrication architecture exploration, is lacking. To address this issue, this article proposes a high-level specification and design methodology for partially reconfigurable VLIW processors. To show the benefit of this approach, a commercial VLIW processor is used as the base architecture, and two application domains are studied for potential performance gain.
Index Terms
Prefabrication and postfabrication architecture exploration for partially reconfigurable VLIW processors