Abstract
Modern Convolutional Neural Networks (CNNs) outperform many conventional feature-based computer vision algorithms in image classification and recognition on large-scale datasets such as ImageNet. However, the high computational complexity of CNN models can limit system performance in power-constrained applications. In this work, we first highlight two levels of model redundancy that widely exist in modern CNNs. We then use MobileNet as a design example and propose an efficient system design for a Redundancy-Reduced MobileNet (RR-MobileNet), in which off-chip memory traffic is used only for input/output transfers while parameters and intermediate values are kept in on-chip BRAM blocks. Compared to AlexNet, our RR-MobileNet has 25\(\times \) fewer parameters and 3.2\(\times \) fewer operations per image inference, yet achieves 9%/5.2% higher Top-1/Top-5 classification accuracy on the ImageNet classification task. The latency of a single image inference is only 7.85 ms.
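The parameter and operation savings cited above stem largely from MobileNet's depthwise separable convolutions, which replace a dense \(K \times K\) convolution with a per-channel depthwise convolution followed by a \(1 \times 1\) pointwise convolution. As an illustration only (this is a minimal PyTorch sketch of the standard MobileNet building block, not the paper's RR-MobileNet implementation; the class name and hyperparameters are assumptions):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Illustrative sketch of MobileNet's depthwise separable convolution:
    a per-channel 3x3 depthwise convolution followed by a 1x1 pointwise
    convolution. Names and sizes here are hypothetical, not taken from
    the paper's RR-MobileNet."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: groups=in_ch applies one 3x3 filter per input channel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        # Pointwise: a 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.depthwise(x)))
        return self.relu(self.bn2(self.pointwise(x)))

# A standard KxK conv needs in_ch*out_ch*K*K weights; the separable form
# needs in_ch*K*K + in_ch*out_ch, roughly a K*K-fold saving when out_ch
# is large -- the source of MobileNet's parameter/operation reductions.
block = DepthwiseSeparableConv(32, 64)
y = block(torch.randn(1, 32, 112, 112))  # -> torch.Size([1, 64, 112, 112])
```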