Abstract
Due to an increasing need for flexibility, embedded systems embody more and more programmable processors as their core components. Due to silicon area and power considerations, the corresponding instruction sets are often highly encoded to minimize code size for given performance requirements. This has hampered the development of robust optimizing compilers because the resulting irregular instruction set architectures are far from convenient compiler targets. Among other considerations, they introduce an interdependence between the tasks of instruction selection and scheduling. This so-called phase coupling is so strong that, in practice, instruction selection rather than scheduling is responsible for the quality of the schedule, which tends to disappoint. The lack of efficient compilation tools has also severely hampered the design space exploration of code-size efficient instruction sets, and correspondingly, their tuning to the application domain. In this article, we present an approach that reduces the need for explicit instruction selection by transferring constraints implied by the instruction set to static resource constraints. All resulting schedules are then guaranteed to correspond to a valid implementation with given instructions. We also demonstrate the suitability of this model to enable instruction set design (-space exploration) with a simple, well-understood and proven method long used in high-level synthesis (HLS) of ASICs. Experimental results show the efficacy of our approach.
- Aho, A. V. 1989. Code generation using tree matching and dynamic programming. ACM Trans. Program. Lang. Syst. 11, 4 (Oct.), 491--516. Google ScholarDigital Library
- Aho, A. V. and Johnson, S. C. 1976. Optimal code generation for expression trees. J. ACM 23, 3 (July), 488--501. Google ScholarDigital Library
- Araujo, G. and Malik, S. 1995. Optimal code generation for embedded memory non-homogeneous register architectures. In Proceedings of the 8th International Symposium on System Synthesis. IEEE Computer Society Press, Los Alamitos, CA, 36--41. Google ScholarDigital Library
- Araujo, G., Marlik, S., and Lee, M. 1996. Using register-transfer paths in code generation for heterogeneous memory-register architectures. In Proceedings of the 33rd Design Automation Conference. ACM, New York, NY, 591--596. Google ScholarDigital Library
- Avis, D., Bremner, D., and Seidel, R. 1997. How good are convex hull algorithms? Comput. Geom. Theor. Appl. 7, 265--301. Google ScholarDigital Library
- Bala, V. and Rubin, N. 1995. Efficient instruction scheduling using finite-state automata. In Proceedings of the 28th Annual International Symposium on Microarchitecture. IEEE Computer Society Press, Los Alamitos, CA, 46--56. Google ScholarDigital Library
- Bashford, S. and Leupers, R. 1999. Phase-coupled mapping of data flow graphs to irregular data paths. Des. Automa. Embedded Syst. 4, 2/3, 119--165.Google Scholar
- Chazelle, B. 1993. An optimal convex hull algorithm in any fixed dimension. Discrete Comput. Geom. 10, 377--409.Google ScholarDigital Library
- Eisenbeis, C., Chamski, Z., and Rohou, E. 1999. Flexible issue slot assignment for VLIW architectures. In Conference Record of the 4th International Workshop on Software and Compilers for Embedded Systems.Google Scholar
- Emmelmann, H., Schroer, F. W., and Landwehr, R. 1989. Beg-a generator for efficient back ends. SIGPLAN Not. 24, 7 (July), 227--237. Google ScholarDigital Library
- Ercegovac, M., Kirovski, D., and Potkonjak, M. 1999. Low-power behavioral synthesis optimization using multiple precison arithmetic. In Proceedings of the 36th Design Automation Conference. ACM, New York, NY, 568--573. Google ScholarDigital Library
- Fauth, A., van Praet, J., and Freericks, M. 1995. Describing instruction set processors using nML. In Proceedings of the European Design and Test Conference. IEEE Computer Society Press, Los Alamitos, CA, 503--507. Google ScholarDigital Library
- Fraser, C. W., Henry, R., and Proebsting, T. A. 1993. Engineering a simple efficient code-generator generator. ACM Lett. Program. Lang. Syst. 1, 3 (Sept.), 213--226. Google ScholarDigital Library
- Fukuda, K. 2000. Frequently asked questions in polyhedral computation. Version 16, Oct. 2000. http://www.ifor.math.ethz.ch/˜fukuda/fukuda.html.Google Scholar
- Gebotys, C. H. 1997. An efficient model for DSP code generation: performance, code size, estimated energy. In Proceedings of the 10th International Symposium on System Synthesis. IEEE Computer Society Press, Los Alamitos, CA, 41--47. Google ScholarDigital Library
- Gyllenhaal, J. G., Hwu, W. W., and Rau, B. R. 1996. Optimization of machine description for efficient use. In Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society Press, Los Alamitos, CA, 349--358. Google ScholarDigital Library
- Hanono, S., Hadjiyiannis, G., and Devadas, S. 1997. Aviv: A retargetable code generator using ISDL. In Proceedings of the 34th Design Automation Conference. ACM, New York, NY, 510--515.Google Scholar
- Hartmann, R. 1992. Combined scheduling and data routing for programmable ASIC systems. In Proceedings of the European Conference on Design Automation. IEEE Computer Society Press, Los Alamitos, CA, 486--490.Google ScholarCross Ref
- Leupers, R. 1997. Retargetable code generation for digital signal processors. Kluwer, Dordrecht, The Netherlands. Google ScholarDigital Library
- Leupers, R. 2000. Register allocation for common subexpression in DSP data paths. In Asia South Pacific Design Automation Conference. IEEE, Piscataway, NJ, 235--240. Google ScholarDigital Library
- Leupers, R. and Marwedel, P. 1996. Instruction selection for embedded DSPs with complex instructions. In Proceedings of the European Design Automation Conference with EURO-VHDL. IEEE Computer Society Press, Los Alamitos, CA, 200--205. Google ScholarDigital Library
- Liem, C., May, T., and Paulin, P. 1994. Instruction-set matching and selection for DSP and ASIP code generation. In Proceedings of the 7th International Symposium on System Synthesis. IEEE Computer Society Press, Los Alamitos, CA, 31--37.Google Scholar
- Lowney, P. G., Freudenberger, A. M., Karzes, T. J., Lichtenstein, W. D., Nix, R. P., and O'Donnell, J. Supercomput. 1993. The multiflow trace scheduling compiler. J. Supercomput. 7, 1-2 (May), 51--142. Google ScholarDigital Library
- Mesman, B., Timmer, A. H., van Meerbergen, J. L., and Jess, J. A. G. 1999. Constraint analysis for DSP code generation. IEEE Trans. Comput.-Aided Des. Integrated Circuits Syst. 18, 1 (Jan.), 44--57. Google ScholarDigital Library
- Novack, S., Nicolau, A., and Dutt, N. 1995. A unified code generation approach using mutation scheduling. In Code generation for Embedded Processors, P. Marwedel and G. Goossens, Eds. Kluwer, Dordrecht. Ch. 12.Google Scholar
- Paulin, P. and Liem, C. 1996. Embedded systems: tools and trends. Tutorial at the European Design and Test Conference (Paris, March 1996).Google Scholar
- Preparata, R. P. and Shamos, M. I. 1985. Computational Geometry: An Introduction. Springer, Heidelberg, Germany. Google ScholarDigital Library
- Proebsting, T. A. and Fraser, C. W. 1994. Detecting pipeline harzards quickly. In Conference Record of the 21st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York, NY, 280--286. Google ScholarDigital Library
- Rau, B. R. 1996. Iterative modulo scheduling. Int. J. Parallel Program. 24, 1 (Feb.), 3--64.Google ScholarDigital Library
- Rau, B. R. and Glaeser, C. D. 1981. Some scheduling techniques and an easily scheduable horizontal architecture for high performance scientific computing. In Proceedings of the 14th Workshop on Microprogramming. IEEE, New York, NY, 183--198. Google ScholarDigital Library
- Rimey, K. and Hilfinger, P. N. 1988. Lazy data routing and greedy scheduling for application-specific signal processors. In Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture. IEEE Computer Society Press, Washington, DC, 111--115. Google ScholarDigital Library
- Timmer, A. H., Strik, M. T. J., van Meerbergen, J. L., and Jess, J. A. G. 1995. Conflict modelling and instruction scheduling in code generation for in-house DSP cores. In Proceedings of the 32nd Design Automation Conference. ACM, New York, NY, 593--598. Google ScholarDigital Library
- van Eijk, C. A. J., Mesman, B., Alba Pinto, C. A., Zhao, Q., Bekooij, M., van Meerbergen, J. L., and Jess, J. A. G. 2000. Constraint analysis for code generation: basic techniques and applications in facts. ACM Trans. Design Automa. Electron. Syst. 5, 4 (Oct.), 774--793. Google ScholarDigital Library
- van Praet, J., Goossens, G., Lanner, D., and De Man, H. 1994. Instruction set definition and instruction selection for ASIPs. In Proceedings of the 7th International Symposium on System Synthesis. IEEE Computer Society Press, Los Alamitos, CA, 11--16. Google ScholarDigital Library
- Wess, B. 1991. Automatic code generation for integrated digital signal processors. In Proceedings of the IEEE International Symposium on Circuits and Systems. IEEE, New York, NY.Google ScholarCross Ref
- Wilson, T., Grewal, G., Halley, B., and Banerji, D. 1994. An integrated approach to retargetable code generation. In Proceedings of the 7th International Symposium on System Synthesis. IEEE Computer Society Press, Los Alamitos, CA, 11--16. Google ScholarDigital Library
- Woudsma, R. 1994. EPICS, a flexible approach to embedded DSP cores. In Proceedings of the International Conference on Signal Processing, Application and Technology. DSP Associates, Waltham, MA, 506--511.Google Scholar
- Zhao, Q., Basten, T., Mesman, B., van Eijk, C. A. J., and Jess, J. A. G. 2001. Static resource models for instruction sets. In Proceedings of the 14th International Symposium on System Synthesis. ACM, New York, NY, 159--164. Google ScholarDigital Library
- Zhao, Q., Mesman, B., and Basten, T. 2002. Practical instruction set design and compiler retargetability using static resource models. In Proceedings of Design Automation and Test in Europe. IEEE Computer Society Press, Los Alamitos, CA, 1021--1026. Google ScholarDigital Library
Index Terms
- Static resource models for code-size efficient embedded processors
Recommendations
Constraint analysis for code generation: basic techniques and applications in FACTS
Code generation methods for digital signal processors are increasingly hampered by the combination of tight timing constraints imposed by signal p processing applications and resource constraints implied by the processor architecture. In particular, ...
A Register Pressure Sensitive Instruction Scheduler for Dynamic Issue Processors
PACT '97: Proceedings of the 1997 International Conference on Parallel Architectures and Compilation TechniquesSeveral modern superscalar processors contain an out of order (OOO) instruction issue mechanism, which resolves dependencies between instructions to expose greater instruction level parallelism (ILP). How to extend a traditional instruction scheduler to ...
Machine-Description Driven Compilers for EPIC and VLIW Processors
In the past, due to the restricted gate count available on an inexpensive chip, embedded DSPs have had limited parallelism, few registers and irregular, incomplete interconnectivity. More recently, with increasing levels of integration, embedded VLIW ...
Comments