DOI: 10.1145/3508546.3508649
Research article

An LSTM Acceleration Method Based on Embedded Neural Network Accelerator

Published: 25 February 2022

Abstract

As neural network technology matures, chips that accelerate neural network inference are appearing in large numbers. Faced with the complex operators (such as LSTM) that continue to emerge as neural network algorithms evolve, modifying the hardware design of an inference chip to support every new operator is impractical. Enabling existing hardware to support new operators through software therefore has significant research value and practical importance. We propose an LSTM acceleration method based on an embedded neural network accelerator: the LSTM operator is split in software into multiple basic operators that the accelerator already supports, and the resulting computation is then optimized, so that the embedded neural network accelerator executes LSTM operators quickly and efficiently. Experimental results show that the execution efficiency of the LRCN model deployed on a low-power accelerator is 1.6× and 1.3× higher than on a CPU and a GPU, respectively.
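The abstract does not give the paper's actual decomposition, but the general idea of splitting an LSTM step into primitives an accelerator already supports can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: one LSTM cell expressed only as matrix multiplies, adds, sigmoid/tanh activations, and elementwise products — the kinds of basic operators embedded NN accelerators typically provide. All function and variable names here are hypothetical.

```python
import numpy as np

def sigmoid(x):
    # Basic activation available on most accelerators.
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_decomposed(x, h, c, W, U, b):
    """One LSTM time step built only from basic operators:
    matmul, add, sigmoid, tanh, and elementwise multiply.
    W: (input_dim, 4*hidden), U: (hidden, 4*hidden), b: (4*hidden,)."""
    # Fused gate pre-activations: two matmuls and one add.
    z = x @ W + h @ U + b
    # Split into input, forget, candidate, and output gate slices.
    i, f, g, o = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # gate activations
    g = np.tanh(g)                                # candidate cell state
    c_new = f * c + i * g                         # elementwise only
    h_new = o * np.tanh(c_new)
    return h_new, c_new
```

Fusing the four gate projections into a single `(input_dim, 4*hidden)` matmul, as above, is a common optimization when mapping LSTM onto matrix-multiply hardware, since it replaces eight small matmuls with two large ones.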

Supplementary Material

Poster (LSTM_poster.pdf)


Cited By

  • (2024) High precision temperature measurement for cryogenic temperature sensors based on deep learning technology. Cryogenics 140, 103830. DOI: 10.1016/j.cryogenics.2024.103830. June 2024.
  • (2022) Sequential Characteristics Based Operators Disassembly Quantization Method for LSTM Layers. Applied Sciences 12(24), 12744. DOI: 10.3390/app122412744. December 2022.
  • (2022) A New Quantization Deployment Method of Neural Network Models Integrating LSTM Layers. 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), 1299–1303. DOI: 10.1109/PRAI55851.2022.9904120. August 2022.


Published In

ACAI '21: Proceedings of the 2021 4th International Conference on Algorithms, Computing and Artificial Intelligence
December 2021
699 pages
ISBN:9781450385053
DOI:10.1145/3508546

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Acceleration
  2. Deep Learning
  3. LSTM
  4. Model deployment

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ACAI'21

Acceptance Rates

Overall Acceptance Rate 173 of 395 submissions, 44%

