DOI: 10.1145/1140389.1140395
Article

Generic software pipelining at the assembly level

Published: 29 September 2005

Abstract

Software used in embedded systems is subject to strict timing and space constraints. Growing software complexity creates an urgent need for fast program execution under the constraint of very limited code size. However, even modern compilers produce code whose quality is often far from optimal. The PROPAN system is a postpass optimization framework that enables high-quality machine-dependent postpass optimizers to be generated from a concise hardware specification. The postpass approach makes it possible to enhance the code quality of existing compilers and integrates smoothly into existing development tool chains. In this article we present an adaptation of the modulo scheduling software pipelining algorithm to the postpass level. The implementation is fully retargetable and has been incorporated into the PROPAN system. The differences between postpass modulo scheduling and the standard version of the algorithm are outlined. Experiments conducted on the Philips TriMedia TM1000 processor demonstrate that modulo scheduling can be applied at the postpass level and achieves a significant code speedup with a moderate increase in code size.

Published In

SCOPES '05: Proceedings of the 2005 workshop on Software and compilers for embedded systems
September 2005
132 pages
ISBN:1595932070
DOI:10.1145/1140389
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. PROPAN
  2. modulo scheduling
  3. postpass optimization
  4. software pipelining

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 38 of 79 submissions, 48%
