Sparse Direct Convolutional Neural Network

  • Conference paper
Advances in Neural Networks - ISNN 2017 (ISNN 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10261)

Abstract

We propose a new computation- and memory-efficient algorithm to speed up Convolutional Neural Networks (CNNs). Equipped with several million parameters and leveraging large datasets, CNNs have achieved state-of-the-art recognition accuracy. Recently, several acceleration techniques that exploit the sparsity of parameters have been introduced, shifting the dominant computation from dense to sparse and opening an opportunity to design a new convolution algorithm suited to high-bandwidth architectures such as the SX-ACE.

In this paper we propose Sparse Direct Convolution (SDC), a new computation- and memory-efficient convolution algorithm for the inference phase, together with Compressed Sparse Offset (CSO), a new representation for sparse filters. We evaluate our implementation of SDC with CSO on the high-bandwidth SX-ACE architecture and show that the inference time of a single convolution layer can be reduced by up to 95% for LeNet, 65% for AlexNet, and 69% for VGG-16 without any drop in accuracy.
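The abstract gives no implementation details for SDC or CSO, so the sketch below is only a hypothetical illustration of the general idea it describes: store each sparse filter as a list of nonzero weights paired with precomputed offsets into the input, and let the convolution loop run directly over that compressed list. The type cso_filter_t, its field names, the single-channel / stride-1 / valid-padding setting, and the offset encoding are all assumptions made for this example, not the authors' actual format or code.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical CSO-style storage for one sparse filter: only the
     * nonzero weights are kept, each paired with a precomputed offset
     * into the input feature map, so the inner convolution loop needs
     * no kernel-index arithmetic. */
    typedef struct {
        int    nnz;   /* number of nonzero weights            */
        float *val;   /* nonzero weight values, length nnz    */
        int   *off;   /* precomputed input offsets, length nnz */
    } cso_filter_t;

    /* Sparse direct convolution (single channel, stride 1, valid padding):
     * each output pixel accumulates only the nonzero weights, reading the
     * input through the precomputed offsets. The reduction over nonzeros
     * is a simple, regular loop, which suits a high-bandwidth vector
     * machine such as the SX-ACE. */
    static void sparse_direct_conv2d(const float *in, int in_h, int in_w,
                                     const cso_filter_t *f, int k_h, int k_w,
                                     float *out)
    {
        int out_h = in_h - k_h + 1;
        int out_w = in_w - k_w + 1;

        for (int y = 0; y < out_h; ++y) {
            for (int x = 0; x < out_w; ++x) {
                const float *base = in + y * in_w + x;
                float acc = 0.0f;
                for (int i = 0; i < f->nnz; ++i)
                    acc += f->val[i] * base[f->off[i]];
                out[y * out_w + x] = acc;
            }
        }
    }

    /* Build the compressed filter from a dense k_h x k_w filter. The
     * offsets bake in the input row stride in_w. */
    static cso_filter_t cso_from_dense(const float *w, int k_h, int k_w, int in_w)
    {
        cso_filter_t f = { 0, NULL, NULL };
        for (int r = 0; r < k_h; ++r)
            for (int c = 0; c < k_w; ++c)
                if (w[r * k_w + c] != 0.0f)
                    f.nnz++;
        f.val = malloc(f.nnz * sizeof *f.val);
        f.off = malloc(f.nnz * sizeof *f.off);
        int i = 0;
        for (int r = 0; r < k_h; ++r)
            for (int c = 0; c < k_w; ++c)
                if (w[r * k_w + c] != 0.0f) {
                    f.val[i] = w[r * k_w + c];
                    f.off[i] = r * in_w + c;  /* offset relative to the output pixel's input base */
                    i++;
                }
        return f;
    }

    int main(void)
    {
        /* Toy example: 5x5 input, 3x3 filter with three nonzero weights. */
        enum { IN_H = 5, IN_W = 5, K_H = 3, K_W = 3 };
        float in[IN_H * IN_W];
        for (int i = 0; i < IN_H * IN_W; ++i) in[i] = (float)i;

        const float w[K_H * K_W] = { 1.0f,  0.0f, 0.0f,
                                     0.0f, -2.0f, 0.0f,
                                     0.0f,  0.0f, 1.0f };

        cso_filter_t f = cso_from_dense(w, K_H, K_W, IN_W);

        float out[(IN_H - K_H + 1) * (IN_W - K_W + 1)];
        sparse_direct_conv2d(in, IN_H, IN_W, &f, K_H, K_W, out);

        for (int i = 0; i < 3 * 3; ++i)
            printf("%6.1f%s", out[i], (i % 3 == 2) ? "\n" : " ");

        free(f.val);
        free(f.off);
        return 0;
    }

Note that in this sketch the offsets depend on the input width, so the compressed filter would have to be rebuilt whenever the input shape changes; how the actual CSO format handles this, and how multiple channels and filters are laid out, is not described in the abstract.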

Author information

Corresponding author

Correspondence to Vijay Daultani.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Daultani, V., Ohno, Y., Ishizaka, K. (2017). Sparse Direct Convolutional Neural Network. In: Cong, F., Leung, A., Wei, Q. (eds) Advances in Neural Networks - ISNN 2017. ISNN 2017. Lecture Notes in Computer Science (LNTCS), vol 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_35

  • DOI: https://doi.org/10.1007/978-3-319-59072-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59071-4

  • Online ISBN: 978-3-319-59072-1

  • eBook Packages: Computer Science, Computer Science (R0)
