Abstract
Compiler-in-the-Loop (CiL) architecture exploration is widely accepted as the right approach for fast development of Application Specific Instruction-set Processors (ASIPs). In this context, both automatic application-specific Instruction Set Extension (ISE) and code generation by a compiler have received considerable attention in the past. Together, the two techniques enable processor designers to quickly adapt a processor’s Instruction Set Architecture (ISA) to the needs of a given set of applications and to provide an appropriate high-level programming model. This manuscript presents a tool flow for the automated identification and utilization of Custom Instructions (CIs) during architecture exploration. By embedding this tool flow in an industry-proven architecture exploration framework, a methodology for simultaneous compiler/architecture co-exploration is derived. The advantage of the presented tool flow lies in its ability to develop a reusable ISA and an appropriate compiler for a set of applications, and therefore to support the design of programmable architectures. In addition, ASIP architecture exploration is improved because time-consuming application analysis and compiler retargeting are automated. Through compilation and simulation of several benchmarks for the extended ISAs, reliable feedback on speedup, code size and usability of the identified CIs is provided. Furthermore, results on area consumption for the extended ISAs are presented in order to compare the obtained speedup with the hardware effort invested in the new CIs.
References
aiSee Graph Description Language (GDL) (2008). http://www.aisee.com/gdl/nutshell
Graph Matching Library: VFLib (2008). http://amalfi.dis.unina.it/graph/db/vflib-2.0
Partitioned Boolean Quadratic Programming (PBQP) Solver (2008). http://www.it.usyd.edu.au/~scholz/pbqp.html
An Infrastructure for Research in Instruction-Level Parallelism (2009). http://www.trimaran.com
ACE—Associated Compiler Experts bv (2009) The COSY Compiler Development System. http://www.ace.nl
Aditya S, Kathail V, Rau B (1999) Elcor’s machine description system: Version 3.0. Tech rep, Hewlett-Packard Company
Aho AV, Ganapathi M, Tjiang SWK (1989) Code generation using tree pattern matching and dynamic programming. ACM Trans Program Lang Syst 11(4):491–516
Araujo G, Malik S, Lee M (1996) Using register transfer paths in code generation for heterogeneous memory register architectures. In: Proc of the design automation conference (DAC), pp 591–596
ARC International (2008) ARCtangent Processor. http://www.arc.com
Arnold M, Corporaal H (2001) Designing domain-specific processors. In: Proc of the conference on hardware/software codesign (CODES-ISSS), pp 61–66
Atasu K, Duendar G, Oezturan C (2005) An integer linear programming approach for identifying instruction set extensions. In: Proc of the conference on hardware/software codesign (CODES-ISSS), pp 172–177
Baleani M, Gennari F, Jiang Y, Patel Y, Brayton RK, Sangiovanni-Vincentelli A (2002) HW/SW partitioning and code generation of embedded control applications on a reconfigurable architecture platform. In: Proc of the conference on hardware/software codesign (CODES-ISSS), pp 151–156
Biswas P, Dutt N, Ienne P, Pozzi L (2006) Automatic identification of application-specific functional units with architecturally visible storage. In: Proc of the conference on design, automation & test in Europe (DATE), pp 1–6
Bonzini P, Pozzi L (2006) Code transformation strategies for extensible embedded processors. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 242–252
Bonzini P, Pozzi L (2007) A retargetable framework for automated discovery of custom instructions. In: Proc of the conference on application specific systems, architectures, and processors (ASAP), pp 334–341
Bonzini P, Pozzi L (2007) Polynomial-time subgraph enumeration for automated instruction set extension. In: Proc of the conference on design, automation & test in Europe (DATE), pp 1331–1336
Bonzini P, Pozzi L (2008) Recurrence-aware instruction set selection for extensible processors. IEEE Trans Very Large Scale Integr (VLSI) Syst 16(10):1259–1267
Brisk P, Kaplan A, Kastner R, Sarrafzadeh M (2002) Instruction generation and regularity extraction for reconfigurable processors. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 262–269
Brisk P, Verma AK, Ienne P (2007) Rethinking custom ISE identification: a new processor-agnostic method. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 125–134
Chang P, Mahlke S, Chen W, Warter N, Hwu W (1991) IMPACT: An architectural framework for multiple-instruction-issue processors. ACM Comput Archit News (SIGARCH) 19(3):266–275
Chen X, Maskell DL, Sun Y (2007) Fast identification of custom instructions for extensible processors. IEEE Trans Comput-Aided Des 26(2):359–368
Clark N, Hormati A, Mahlke S (2006) Scalable subgraph mapping for acyclic computation accelerators. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 147–157
Clark N, Zhong H, Mahlke S (2003) Processor acceleration through automated instruction set customisation. In: Proc of the symposium on microarchitecture, p 129
Clark N, Zhong H, Mahlke S (2005) Automated custom instruction generation for domain-specific processor acceleration. IEEE Trans Comput 54(10):1258–1270
Cong J, Fan Y, Han G, Zhang Z (2004) Application-specific instruction generation for configurable processors. In: Proc of the symposium on field programmable gate arrays (FPGA), pp 183–189
Cordella LP, Foggia P, Sansone C, Vento M (2004) A (sub)graph isomorphism algorithm for matching large graphs. IEEE Trans Pattern Anal Mach Intell 26(10):1367–1372
Day JD, Zimmermann H (1983) The OSI reference model. Proc IEEE 71(12)
Eckstein E, Koenig O, Scholz B (2003) Code instruction selection based on SSA graphs. In: Proc of the workshop on software and compilers for embedded systems (SCOPES), pp 49–65
Ertl MA (1999) Optimal code selection in DAGs. In: Proc of the symposium on principles of programming languages (POPL ’99), pp 242–249
Fauth A (1995) Beyond tool-specific machine descriptions. In: Marwedel P, Goossens G (eds) Code generation for embedded processors. Kluwer Academic, Amsterdam
Fauth A, Knoll A (1993) Automatic generation of DSP program development tools using a machine description formalism. In: Proc of the conf on acoustics, speech and signal processing (ICASSP)
Fraser CW, Hanson DR, Proebsting TA (1992) Engineering efficient code generators using tree matching and dynamic programming. Tech rep TR-386-92
Freericks M (1993) The nML machine description formalism. Tech rep, Technical University of Berlin, Department of Computer Science
Freericks M, Fauth A, Knoll A (1994) Implementation of complex DSP systems using high-level design tools. In: Signal Processing: VI theories and applications
Galuzzi C, Bertels K (2008) The instruction-set extension problem: a survey. In: Workshop on applied reconfigurable computing (ARC). Springer, Heidelberg, pp 209–220
Galuzzi C, Panainte EM, Yankova Y, Bertels K, Vassiliadis S (2006) Automatic selection of application-specific instruction-set extensions. In: Proc of the conference on hardware/software codesign (CODES-ISSS), pp 256–261
Garey MR, Johnson DS (1990) Computers and intractability; a guide to the theory of NP-completeness. Freeman, New York. ISBN: 0716710455
Geurts W et al. (1996) Design of DSP systems with Chess/Checkers. In: Proc of 2nd Int workshop on code generation for embedded processors
Gonzalez R (2000) Xtensa: a configurable and extensible processor. IEEE Micro 20(2):60–70
Goodwin D, Petkov D (2003) Automatic generation of application specific processors. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 137–147
Gries M, Keutzer K (2005) Building ASIPs: the MESCAL methodology. Springer, Berlin. ISBN: 0-387-26057-9
Grun P, Halambi A, Khare A, Ganesh V, Dutt N, Nicolau A (1998) EXPRESSION: an ADL for system level design exploration. Tech rep 98-29, Department of Information and Computer Science, University of California, Irvine, Sep
Gupta R (1992) Generalized dominators and postdominators. In: Proc of the symposium on principles of programming languages (POPL), pp 246–257
Gyllenhaal J, Rau B, Hwu W (1996) HMDES version 2.0 specification. Tech rep, IMPACT Research Group, Univ of Illinois
Halambi A, Shrivastava A, Dutt N, Nicolau A (2001) A customizable compiler framework for embedded systems. In: Proc of the workshop on software and compilers for embedded systems (SCOPES)
Halambi A, Grun P, Ganesh V, Khare A, Dutt N, Nicolau A (1999) EXPRESSION: A language for architecture exploration through compiler/simulator retargetability. In: Proc of the conference on design, automation & test in Europe (DATE)
Hoffmann A, Kogel T, Nohl A, Braun G, Schliebusch O, Wahlen O, Wieferink A, Meyr H (2001) A novel methodology for the design of application specific instruction set processors (ASIP) using a machine description language. IEEE Trans Comput-Aided Des 20(11):1338–1354
Hoffmann A, Meyr H, Leupers R (2003) Architecture exploration for embedded processors with LISA. Kluwer Academic, Amsterdam. ISBN: 1-4020-7338-0
Hohenauer M, Scharwaechter H, Karuri K, Wahlen O, Kogel T, Leupers R, Ascheid G, Meyr H (2004) A methodology and tool suite for C compiler generation from ADL models. In: Proc of the conference on design, automation & test in Europe (DATE)
Itoh M, Takeuchi Y, Imai M, Shiomi A (2000) Synthesizable HDL generation for pipelined processors from a micro-operation description. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Khare A (1999) SIMPRESS: a simulator generation environment for system-on-chip exploration. Tech rep, Department of Information and Computer Science, University of California, Irvine, Sep
Kobayashi S, Takeuchi Y, Kitajima A, Imai M (2001) Compiler generation in PEAS-III: an ASIP development system. In: Workshop on software and compilers for embedded processors (SCOPES)
Koes DR, Goldstein SC (2008) Near optimal instruction selection on DAGs. In: International symposium on code generation and optimization (CGO)
Lanneer D, Van Praet J, Kifli A, Schoofs K, Geurts W, Thoen F, Goossens G (1995) Chess: retargetable code generation for embedded DSP processors
Leupers R, Karuri K, Kraemer S, Pandey M (2006) A design flow for configurable embedded processors based on optimized instruction set synthesis. In: Proc of the conference on design, automation & test in Europe (DATE), pp 581–586
Leupers R, Marwedel P (1996) Instruction selection for embedded DSPs with complex instructions. In: Proc of the European conference on design automation (EDAC), pp 200–205
Liao S, Devadas S, Keutzer K, Tjiang S (1995) Instruction selection using binate covering for code size optimization. In: Proc of the conf on computer aided design (ICCAD), pp 393–399
Liem C, May T, Paulin P (1994) Instruction-set matching and selection for DSP and ASIP code generation. In: Proc of the European design and test conference (ED & TC), pp 31–37
Mishra P, Dutt N, Nicolau A (2001) Functional abstraction driven design space exploration of heterogeneous programmable architectures. In: Proc of the symposium on system synthesis (ISSS), pp 256–261
Nilsson NJ (1982) Principles of artificial intelligence. Springer, Berlin. ISBN-13: 9783540113409
Nohl A, Braun G, Schliebusch O, Leupers R, Meyr H (2002) A universal technique for fast and flexible instruction-set architecture simulation. In: Proc of the design automation conference (DAC), pp 22–27
Peymandoust A, Pozzi L, Ienne P, De Micheli G (2003) Automatic instruction set extension and utilization for embedded processors. In: Proc of the conference on application specific systems, architectures, and processors (ASAP), pp 108–118
Pozzi L, Atasu K, Ienne P (2006) Exact and approximate algorithms for the extension of embedded processor instruction sets. IEEE Trans Comput-Aided Des 25(7):1209–1229
Pozzi L, Vuletic M, Ienne P (2002) Automatic topology-based identification of instruction-set extensions for embedded processors. In: Proc of the conference on design, automation & test in Europe (DATE)
Sakai S, Togasaki M, Yamazaki K (2003) A note on greedy algorithms for the maximum weighted independent set problem. Discrete Appl Math 126:313–322
Scharwaechter H, Youn JM, Leupers R, Paek Y, Ascheid G, Meyr H (2007) A code generator generator for multi-output instructions. In: Proc of the conference on hardware/software codesign (CODES-ISSS)
Schliebusch O, Meyr H, Leupers R (2007) Optimized ASIP synthesis from architecture description language models. Springer, Berlin
Scholz B, Eckstein E (2002) Register allocation for irregular architectures. In: Proc of the workshop on software and compilers for embedded systems (SCOPES), pp 129–148
Scholz B, Eckstein E (2003) Address mode selection. In: Proc of the symposium on code generation and optimization (CGO), pp 337–346
Scholz B, Eckstein E (2006) Partitioned Boolean quadratic programming (PBQP). Tech rep, University of Sydney: School of Information Technologies, Jan
Shah N, Keutzer K (2002) Network processors: origin of species. In: Proc of the Int symposium of computer and information science
Synopsys Inc (2011) Synopsys processor designer. Synopsys Inc. http://www.synopsys.com/Systems/BlockDesign/ProcessorDev/Pages/default.aspx
Teich J, Weper R (2000) A joined architecture/compiler design environment for ASIPs. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES)
Teich J, Weper R, Fischer D, Trinkert S (2000) BUILDABONG: a rapid prototyping environment for ASIPs. In: Proc of the DSP Germany (DSPD)
Tensilica (2009) Xtensa configurable processors. http://www.tensilica.com
Tjiang SWK (1993) An Olive Twig. Tech rep, Synopsys Inc
Ullmann JR (1976) An algorithm for subgraph isomorphism. J ACM 23(1):31–42
Verma AK, Brisk P, Ienne P (2008) Fast, quasi-optimal, and pipelined instruction-set extensions. In: Proc of the conference on design, automation & test in Europe (DATE), pp 334–339
Yang F (1999) ESP: a 10-year retrospective. In: Proc of the embedded systems programming conference
Yu P, Mitra T (2004) Scalable custom instructions identification for instruction set extensible processors. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 69–78
Yu P, Mitra T (2007) Disjoint pattern enumeration for custom instruction identification. In: Proc of the conf on field programmable logic and applications (FPL), pp 273–278
Additional information
Extension of Conference Paper: An earlier version [66] of this paper appeared in the proceedings of the 5th IEEE/ACM international conference on hardware/software codesign and system synthesis. It introduces a code generator named CBurg, which is now used to implement the code-selector engine of the CoSy compiler system from ACE. Additionally, a methodology for recurrence-aware identification of custom instructions is presented that builds on the data flow graphs of the compiler’s intermediate representation. At the same time, it produces a code-generator description that is used to retarget the compiler backend to a new instruction set.
About this article
Cite this article
Scharwaechter, H., Kammler, D., Leupers, R. et al. A retargetable framework for compiler/architecture co-development. Des Autom Embed Syst 15, 311–342 (2011). https://doi.org/10.1007/s10617-011-9080-8
DOI: https://doi.org/10.1007/s10617-011-9080-8