DOI: 10.1145/1140389.1140397
Article

Global memory optimisation for embedded systems allowed by code duplication

Published: 29 September 2005

Abstract

Data transfer and storage are the dominant contributors to area and power consumption in all modern multimedia embedded systems. High-level memory optimisations can ensure a cost-efficient realisation of these systems. An important step in these optimisations is the application of loop transformations on a geometrical model. However, loop transformations traditionally cannot optimise code across data-dependent conditions.

In this paper we selectively duplicate code in order to enable global loop transformations across data-dependent conditions. We propose a technique that systematically finds the Pareto curve in a 2D exploration space: the gain in memory optimisation versus the increase in code size. The technique has been tested on an MP3 audio decoder. Results show a 45.8% decrease in the number of main memory accesses at the cost of a 16.2% increase in code size.
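As an illustration of the 2D trade-off described in the abstract, the following is a minimal sketch of how a Pareto curve could be extracted from a set of candidate solutions, where each candidate pairs a code size increase with a main memory access count and both are to be minimised. It is not the authors' technique, which the abstract does not detail. The function name `pareto_front` and every data point are hypothetical and chosen only for the example.

```python
from typing import List, Tuple

# A candidate is (code_size_increase_pct, main_memory_accesses).
# Both objectives are minimised. The representation is an assumption.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the candidates that no other candidate dominates."""
    front: List[Point] = []
    best_accesses = float("inf")
    # Sweep in order of increasing code size (ties: fewer accesses first).
    # A candidate is kept only if it needs strictly fewer accesses than
    # every candidate that costs no more code.
    for size, accesses in sorted(set(points)):
        if accesses < best_accesses:
            front.append((size, accesses))
            best_accesses = accesses
    return front


# Hypothetical candidates, not measurements from the paper.
candidates = [
    (0.0, 1000.0),   # no duplication at all
    (5.0, 900.0),
    (6.0, 950.0),    # dominated by (5.0, 900.0)
    (10.0, 700.0),
    (12.0, 700.0),   # dominated by (10.0, 700.0)
    (20.0, 540.0),
]

front = pareto_front(candidates)
for size, accesses in front:
    print(f"code +{size:.1f}%  ->  {accesses:.0f} accesses")
```

Sorting first makes the filter a single pass, which is sufficient for two objectives. Spaces with more dimensions generally need a different algorithm.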

    Published In

    SCOPES '05: Proceedings of the 2005 workshop on Software and compilers for embedded systems
    September 2005
    132 pages
    ISBN: 1595932070
    DOI: 10.1145/1140389
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Acceptance Rates

    Overall Acceptance Rate 38 of 79 submissions, 48%

    Cited By

    • (2012) Energy-Efficient Address-Generation Units and Their Design Methodology. Energy-Aware Memory Management for Embedded Multimedia Systems, pp. 309-341. DOI: 10.1201/b11418-11. Online publication date: 4-Jan-2012
    • (2012) Enabling Efficient System Configurations for Dynamic Wireless Applications Using System Scenarios. International Journal of Wireless Information Networks, 20:2, pp. 140-156. DOI: 10.1007/s10776-012-0197-x. Online publication date: 23-Oct-2012
    • (2011) Enabling efficient system configurations for dynamic wireless baseband engines using system scenarios. 2011 IEEE Workshop on Signal Processing Systems (SiPS), pp. 305-310. DOI: 10.1109/SiPS.2011.6088994. Online publication date: Oct-2011
    • (2010) Modeling and exploiting spatial locality trade-offs in wavelet-based applications under varying resource requirements. ACM Transactions on Embedded Computing Systems, 9:3, pp. 1-26. DOI: 10.1145/1698772.1698775. Online publication date: 5-Mar-2010
    • (2009) System-scenario-based design of dynamic embedded systems. ACM Transactions on Design Automation of Electronic Systems, 14:1, pp. 1-45. DOI: 10.1145/1455229.1455232. Online publication date: 23-Jan-2009
    • (2009) Exploiting Varying Resource Requirements in Wavelet-based Applications in Dynamic Execution Environments. Journal of Signal Processing Systems, 56:2-3, pp. 125-139. DOI: 10.1007/s11265-008-0223-5. Online publication date: 1-Sep-2009
    • (2008) Application Scenarios in Streaming-Oriented Embedded-System Design. IEEE Design & Test, 25:6, pp. 581-589. DOI: 10.1109/MDT.2008.158. Online publication date: 1-Nov-2008
    • (2008) Spatial locality trade-offs of wavelet-based applications in dynamic execution environments. 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 1461-1464. DOI: 10.1109/ICASSP.2008.4517896. Online publication date: Mar-2008
    • (2008) Storage Estimation and Design Space Exploration Methodologies for the Memory Management of Signal Processing Applications. Journal of Signal Processing Systems, 53:1-2, pp. 51-71. DOI: 10.1007/s11265-008-0244-0. Online publication date: 1-Nov-2008
    • (2008) Address Generation Optimization for Embedded High-Performance Processors. Journal of Signal Processing Systems, 53:3, pp. 271-284. DOI: 10.1007/s11265-008-0165-y. Online publication date: 1-Dec-2008
