Placement of Linked Dynamic Data Structures over Heterogeneous Memories in Embedded Systems

Published: 17 February 2015

Abstract

Software applications use dynamic memory (allocated and deallocated in the system's heap) to handle dynamism in their working conditions. Embedded systems tend to include complex memory organizations, but most techniques for dynamic memory management do not deal with the placement of data objects in physical memory modules. Additionally, the performance of hardware-controlled cache memories may be severely hindered when they are used with linked data structures. We therefore present a methodology to map dynamic data onto the multilevel memory subsystem of embedded systems, taking advantage of any available memories (e.g., on-chip SRAMs) and avoiding interference with the cache memories. The resulting data placement uses an exclusive memory model and is compatible with existing techniques for managing static data. Our methodology helps the designer achieve, in an automated way, the reductions in energy consumption and execution time that an expert could obtain, while keeping control over the process through multiple configuration knobs.
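
As a rough illustration of what placing dynamic data over heterogeneous memories can involve, the Python sketch below assigns heap pools to memory modules under an exclusive model, where each pool resides in exactly one module. It is not the methodology of the article, which the abstract does not detail. The greedy heuristic (most accesses per byte first, cheapest memory first), the names `Memory`, `Pool` and `place_pools`, and all sizes, access counts and energy figures are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    name: str
    capacity: int             # bytes available for dynamic data
    energy_per_access: float  # hypothetical cost per access (arbitrary units)


@dataclass
class Pool:
    name: str
    size: int      # bytes reserved for the pool
    accesses: int  # access count, e.g. taken from a profiling run


def place_pools(pools, memories):
    """Greedily map each pool to exactly one memory (exclusive placement).

    Pools with the most accesses per byte are placed first, and each one
    goes to the cheapest memory that still has enough free space.
    """
    free = {m.name: m.capacity for m in memories}
    cheapest_first = sorted(memories, key=lambda m: m.energy_per_access)
    placement = {}
    for pool in sorted(pools, key=lambda p: p.accesses / p.size, reverse=True):
        for mem in cheapest_first:
            if free[mem.name] >= pool.size:
                free[mem.name] -= pool.size
                placement[pool.name] = mem.name
                break
        else:
            raise ValueError(f"no memory can hold pool {pool.name!r}")
    return placement


# Hypothetical platform: a small on-chip SRAM and a large off-chip DRAM.
memories = [
    Memory("dram", capacity=1 << 20, energy_per_access=5.0),
    Memory("sram", capacity=4096, energy_per_access=0.5),
]
# Hypothetical pools, one per dynamic data structure.
pools = [
    Pool("list_nodes", size=2048, accesses=100_000),
    Pool("tree_nodes", size=3072, accesses=60_000),
    Pool("buffers", size=65_536, accesses=20_000),
]
placement = place_pools(pools, memories)
```

With these figures, `list_nodes` has the highest access density and fits in the SRAM, while `tree_nodes` no longer fits in the remaining SRAM space and is placed in DRAM together with `buffers`. A real flow would also have to account for fragmentation, pool lifetimes and interaction with the caches, none of which this sketch models.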

Information

Published In

ACM Transactions on Embedded Computing Systems, Volume 14, Issue 2
March 2015
472 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/2737797
Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 February 2015
Accepted: 01 December 2013
Revised: 01 September 2013
Received: 01 April 2013
Published in TECS Volume 14, Issue 2

Author Tags

  1. Design
  2. efficiency
  3. embedded
  4. memory management
  5. memory organization

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • E. C. Marie Curie Fellowship contract MEST-CT-2004-504767
