Abstract
Modern Convolutional Neural Networks (CNNs) outperform many conventional feature-based computer vision algorithms in image classification and recognition on large-scale datasets such as ImageNet. However, the high computational complexity of CNN models can limit system performance in power-constrained applications. In this work, we first highlight two levels of model redundancy that widely exist in modern CNNs. We then use MobileNet as a design example and propose an efficient system design for a Redundancy-Reduced MobileNet (RR-MobileNet), in which off-chip memory traffic is used only for input/output transfers while parameters and intermediate values are kept in on-chip BRAM blocks. Compared to AlexNet, our RR-MobileNet has 25\(\times \) fewer parameters and 3.2\(\times \) fewer operations per image inference, yet achieves 9%/5.2% higher Top-1/Top-5 classification accuracy on the ImageNet classification task. The latency of a single image inference is only 7.85 ms.
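The parameter and operation savings cited above stem largely from MobileNet's depthwise separable convolutions, which replace a dense \(K \times K\) convolution with a per-channel depthwise convolution followed by a \(1 \times 1\) pointwise convolution. As an illustration only (this is a minimal PyTorch sketch of the standard MobileNet building block, not the paper's RR-MobileNet implementation; the class name and hyperparameters are assumptions):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Illustrative sketch of MobileNet's depthwise separable convolution:
    a per-channel 3x3 depthwise convolution followed by a 1x1 pointwise
    convolution. Names and sizes here are hypothetical, not taken from
    the paper's RR-MobileNet."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: groups=in_ch applies one 3x3 filter per input channel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        # Pointwise: a 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.depthwise(x)))
        return self.relu(self.bn2(self.pointwise(x)))

# A standard KxK conv needs in_ch*out_ch*K*K weights; the separable form
# needs in_ch*K*K + in_ch*out_ch, roughly a K*K-fold saving when out_ch
# is large -- the source of MobileNet's parameter/operation reductions.
block = DepthwiseSeparableConv(32, 64)
y = block(torch.randn(1, 32, 112, 112))  # -> torch.Size([1, 64, 112, 112])
```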