DOI: 10.1145/3290420.3290443
Research Article

Log-quantization on GRU networks

Published: 02 November 2018

ABSTRACT

Today, recurrent neural networks (RNNs) are used in various applications such as image captioning, speech recognition, and machine translation. However, because of their data dependencies, RNNs are hard to parallelize. Furthermore, to increase accuracy, RNNs use complicated cell units such as long short-term memory (LSTM) and the gated recurrent unit (GRU). To run such models on an embedded system, the size of the network model and the amount of computation must be reduced to achieve low power consumption and low required memory bandwidth. In this paper, an implementation of a GRU-based RNN with a logarithmic quantization method is proposed. The proposed implementation is synthesized using high-level synthesis (HLS) targeting a Xilinx ZCU102 FPGA running at 100 MHz. With 8-bit log-quantization, the proposed implementation achieves 90.57% accuracy without re-training or fine-tuning, and its memory usage is 31% lower than that of an implementation with a 32-bit floating-point data representation.
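The quantization scheme itself is detailed in the paper body; for orientation, below is a minimal sketch of base-2 logarithmic weight quantization in the general style of Miyashita et al. (arXiv:1603.01025), not the authors' implementation. The bit-width split (one sign bit plus a clipped integer exponent) and the clipping window are illustrative assumptions.

import numpy as np

def log2_quantize(w, num_bits=8):
    """Quantize an array of weights to signed powers of two.

    One bit encodes the sign; the remaining bits encode a clipped
    integer exponent, so an 8-bit code stores sign + 7-bit exponent.
    """
    sign = np.sign(w)
    # Round log2(|w|) to the nearest integer exponent; avoid log(0).
    exp = np.round(np.log2(np.maximum(np.abs(w), 1e-38)))
    # Clip exponents to the representable window (assumed window here).
    exp_max = 0                        # largest magnitude: 2^0 = 1.0
    exp_min = exp_max - 2**(num_bits - 1) + 1
    exp = np.clip(exp, exp_min, exp_max)
    q = sign * np.exp2(exp)
    q[w == 0] = 0.0                    # keep exact zeros as zeros
    return q

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=1000).astype(np.float32)
    wq = log2_quantize(w, num_bits=8)
    print("mean abs error:", np.mean(np.abs(w - wq)))

Because every quantized weight is a signed power of two, each multiplication in the GRU's matrix-vector products can in principle be replaced by a shift on the FPGA, which is what makes an 8-bit log representation attractive for low power consumption and low memory bandwidth.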



Published in

ICCIP '18: Proceedings of the 4th International Conference on Communication and Information Processing
November 2018
326 pages
ISBN: 9781450365345
DOI: 10.1145/3290420
Conference Chairs: Jalel Ben-Othman, Hui Yu
Program Chairs: Herwig Unger, Masayuki Arai

              Copyright © 2018 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

              Publisher

              Association for Computing Machinery

              New York, NY, United States




Acceptance Rates

Overall acceptance rate: 61 of 301 submissions, 20%
