With rapid advances in manufacturing and communication technologies, embedded systems have evolved tremendously in recent years. However, embedded systems usually have limited energy, computing power, and memory/storage space. In particular, the cost of data transfer between the CPU and storage/memory has become a critical challenge for such systems. Meanwhile, memory and storage technologies have advanced significantly over the past decades and now increasingly support computing capabilities. This trend of adding computing functions in or near memory and storage devices to enable “memory and storage computing” provides a new opportunity to resolve the performance bottleneck caused by the massive amount of data movement between the CPU and memory/storage units, and it has become a hot topic widely acknowledged by both academia and industry.
After a rigorous review process, in which reviewers were selected for their expertise on the topic of each submission, this special issue represents a collective effort from the research community and industry participants on an international scale. From the many excellent submissions received, ten articles were selected for inclusion. These articles tackle some of the most recent and impactful design issues of in/near memory and storage computing for embedded systems, and they are briefly discussed in the rest of this introduction.
With the increasing scale of cloud computing applications in next-generation embedded systems, a major challenge that domain scientists face is how to efficiently store and analyze the vast volume of output data. Compression is one of the most popular methods to address this problem. Li et al. in “AMP: Total Variation Reduction for Lossless Compression via Approximate Median-based Preconditioning” present a total variation reduction method that addresses the fact that most large datasets are in floating-point format by employing a median-based hyperplane to precondition the data.
The recently introduced eADR feature guarantees that data buffered in the CPU cache is flushed to persistent memory on a power outage, thereby making the CPU cache a transient persistence domain and enabling atomic durability for applications’ data in persistent memory. Ye et al. in “Hercules: Enabling Atomic Durability for Persistent Memory with Transient Persistence Domain” propose a hardware logging design that provides transaction-level atomic durability for persistent memory with a transient persistence domain.
As the core operation of lattice ciphers, large-scale polynomial multiplication is the biggest computational bottleneck in their realization. How to quickly compute polynomial multiplication under resource constraints has become an urgent problem in the hardware implementation of lattice ciphers. Du et al. in “Analog In-memory Circuit Design of Polynomial Multiplication for Lattice Cipher Acceleration Application” propose an analog in-memory circuit for fast polynomial multiplication.
Existing architectural studies on ReRAM-based processing-in-memory (PIM) DNN accelerators generally assume that all weights of the DNN can be mapped to the crossbar at once. In practice, however, the ReRAM crossbar resources available for computation are limited by technological constraints, so multiple weight-mapping procedures are required during inference. Under this restriction, Gao et al. in “Static Scheduling of Weight Programming for DNN Acceleration with Resource Constrained PIM” propose a static scheduling framework that generates the mapping between DNN weights and ReRAM cells with minimal runtime weight-programming cost.
Recent advancements in the fabrication of ReRAM devices have led to the development of large-scale crossbar structures. In-memory computing architectures relying on ReRAM crossbars aim to mitigate the processor-memory bottleneck of current CMOS technology. However, verification of designs realized on ReRAM crossbars is done either through manual inspection or with simulation-based approaches, neither of which scales to the verification of complex designs on large ReRAM crossbars. Bhunia et al. in “ReSG: A Data Structure for Verification of Majority based In-Memory Computing on ReRAM Crossbars” propose an automatic equivalence checking flow that determines the equivalence between the original function specification and the crossbar micro-operation file.
Voltage scaling is one of the most promising approaches for improving energy efficiency, but it also makes it challenging to fully guarantee stable operation in modern VLSI. To tackle these issues, Liang et al. in “A Robust and Energy Efficient Hyperdimensional Computing System for Voltage-scaled Circuits” propose a Hyperdimensional Computing (HDC) system that can tolerate bit-level memory failures in the low-voltage region with high robustness. It is the second version of DependableHD, incorporating margin enhancement for model retraining, noise injection for improving robustness, and a dimension-swapping technique.
Data movement in large-scale computing facilities (from compute nodes to data nodes) has been identified as a major contributor to high cost and energy consumption. To tackle this, in-storage processing (ISP) within storage devices, such as computational storage drives (CSDs), has been widely studied. One of the key challenges of building a CSD-based storage system within a compute node is that commercialized CSDs have different hardware resources and performance characteristics. Byun et al. in “An Analytical Model-based Capacity Planning Approach for Building CSD-based Storage Systems” propose an analytical model-based storage capacity planner that helps system architects build performance-effective CSD-based compute nodes.
Near-data processing (NDP) is widely studied as a way to solve the write-amplification issue caused by compaction operations in LSM-tree-based key-value (KV) stores. However, the performance of NDP frameworks with synchronous parallel schemes is limited by the subsystem with the lower compaction performance. Sun et al. in “An Asynchronous Compaction Acceleration Scheme for Near-Data Processing-enabled LSM-Tree-based KV Stores” propose an asynchronous parallel scheme that solves this problem with a multi-task queue and three priority-based scheduling methods.
In-memory processing is becoming a popular method to alleviate the memory bottleneck of the von Neumann computing model. Meanwhile, spintronic Racetrack Memory (RM) is a non-volatile memory technology widely studied to meet the latency and energy requirements of in-memory processing. Bera et al. in “SPIMulator: A Spintronic Processing-In-Memory Simulator for Racetracks” propose a spintronic PIM simulator that simulates both the storage and the PIM architecture executing PIM commands in Racetrack memory.
Energy-harvesting-based Internet of Things (IoT) devices have received attention due to advantages such as a green, low-carbon footprint, convenient maintenance, and a theoretically infinite lifetime. Meanwhile, ReRAM-based convolutional neural network (CNN) accelerators are widely studied to cope with the instability of harvested energy. Considering the mismatch between the power requirements of different CNN layers and the variation of harvested power, Zhou et al. in “REC: REtime Convolutional layers to fully exploit harvested energy for ReRAM-based CNN accelerators” propose a novel strategy that retimes the convolutional layers of CNN inferences to improve the performance and energy efficiency of energy-harvesting ReRAM-based accelerators.
The guest editors thank the reviewers for their valuable time, expertise, and constructive feedback in their reviews. We also thank all the authors for their submissions and their accommodation of the publication deadlines and constraints. Finally, we would also like to thank the Editor-in-Chief of ACM Transactions on Embedded Computing Systems, Professor Tulika Mitra, whose help made this special issue possible.
Liang Shi
East China Normal University, China
Jingtong Hu
University of Pittsburgh, USA
Hussam Amrouch
University of Stuttgart, Germany
Kuan-Hsun Chen
University of Twente, Netherlands
Mengying Zhao
Shandong University, China
Weichen Liu
Nanyang Technological University, Singapore
Guest Editors