Abstract
This paper presents a hardware-aware sparse fully convolutional network (SFCN) for semantic segmentation on an FPGA. Semantic segmentation attracts interest since for self-driving car it is important to recognize road and obstacles in pixel level. However, it is hard to implement the system on embedded systems since the number of weights for the SFCN is so large that embedded systems cannot store them using limited on-chip memory. To realize good a trade-off between speed and accuracy, we construct an AlexNet-based SFCN which has no skip connections and deconvolution layers to reduce the computation costs and the latency. Furthermore, we propose a filter-wise pruning technique that sorts the weights of each filter by their absolute values and prunes them by a preset percent filter-by-filter from a small order. It is more suitable for the hardware implementation since the number of computation of each filter becomes equal. We trained the AlexNet-based SFCN by using Camvid image dataset and implemented on Xilinx zcu102 evaluation board. The results show that the FPGA system is 10.14 times faster than a mobile GPU one, and its performance per power consumption is 24.49 times higher than the GPU counterpart.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Lecun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
Lyu, Y., Bai, L., Huang, X.: Real-time road segmentation using LiDAR data processing on an FPGA. In: 2018 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–5, May 2018
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, NIPS 2012, pp. 1097–1105. Curran Associates Inc., USA, (2012)
Song, H., Mao, H., Dally, W.J.: Deep compression: compressing deep neural network with pruning, trained quantization and huffman coding. CoRR, abs/1510.00149 (2015)
Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2017)
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6230–6239, July 2017
Zhao, H., Qi, X., Shen, X., Shi, J., Jia, J.: ICNet for real-time semantic segmentation on high-resolution images. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 418–434. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01219-9_25
Zhu, M., Gupta, S.: To prune, or not to prune: exploring the efficacy of pruning for model compression. CoRR, abs/1710.01878 (2017)
Molchanov, D., Ashukha, A., Vetrov, D.: Variational dropout sparsifies deep neural networks. arXiv preprint arXiv:1701.05369 (2017)
Alvarez, J.M., Salzmann, M.: Compression-aware training of deep networks. In: Advances in Neural Information Processing Systems, pp. 856–867 (2017)
Fujii, T., Sato, S., Nakahara, H., Motomura, M.: An FPGA realization of a deep convolutional neural network using a threshold neuron pruning. In: Wong, S., Beck, A.C., Bertels, K., Carro, L. (eds.) ARC 2017. LNCS, vol. 10216, pp. 268–280. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56258-2_23
Yu, J., Lukefahr, A., Palframan, D., Dasika, G., Das, R., Mahlke, S.: Scalpel: customizing DNN pruning to the underlying hardware parallelism. In: Proceedings of the 44th Annual International Symposium on Computer Architecture, ISCA 2017, pp. 548–560, New York, USA. ACM (2017)
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)
Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)
Gao, J., Li, Z., Nevatia, R., et al.: Knowledge concentration: learning 100k object classifiers in a single CNN. arXiv preprint arXiv:1711.07607 (2017)
Chen, G., Choi, W., Yu, X., Han, T., Chandraker, M.: Learning efficient object detection models with knowledge distillation. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 742–751. Curran Associates Inc. (2017)
Tokui, S., Oono, K., Hido, S., Clayton, J.: Chainer: a next-generation open source framework for deep learning. In: Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Twenty-ninth Annual Conference on Neural Information Processing Systems (NIPS) (2015)
Niitani, Y., Ogawa, T., Saito, S., Saito, M.: ChainerCV: a library for deep learning in computer vision. In: ACM Multimedia (2017)
Brostow, G.J., Fauqueur, J., Cipolla, R.: Semantic object classes in video: a high-definition ground truth database. Pattern Recognit. Lett. 30, 88–97 (2009)
Acknowledgments
This research is supported in part by the Grants in Aid for Scientific Research from JSPS, and the New Energy and Industrial Technology Development Organization (NEDO). In addition, thanks are extended to the Xilinx University Program (XUP), the Intel University Program, and NVidia Corp. for their support.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Shimoda, M., Sada, Y., Nakahara, H. (2019). Filter-Wise Pruning Approach to FPGA Implementation of Fully Convolutional Network for Semantic Segmentation. In: Hochberger, C., Nelson, B., Koch, A., Woods, R., Diniz, P. (eds) Applied Reconfigurable Computing. ARC 2019. Lecture Notes in Computer Science(), vol 11444. Springer, Cham. https://doi.org/10.1007/978-3-030-17227-5_26
Download citation
DOI: https://doi.org/10.1007/978-3-030-17227-5_26
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17226-8
Online ISBN: 978-3-030-17227-5
eBook Packages: Computer ScienceComputer Science (R0)