Sparse Direct Convolutional Neural Network

  • Conference paper
Advances in Neural Networks - ISNN 2017 (ISNN 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10261)

Abstract

We propose a new computation- and memory-efficient algorithm to speed up Convolutional Neural Networks (CNNs). Equipped with several million parameters and leveraging large datasets, CNNs have achieved state-of-the-art recognition accuracy. Recently, several acceleration techniques that exploit the sparsity of parameters have been introduced, shifting the dominant computation from dense to sparse and opening an opportunity to design a new convolution algorithm suited to high-bandwidth architectures such as the SX-ACE.

In this paper we propose Sparse Direct Convolution (SDC), a new computation- and memory-efficient convolution algorithm for the inference phase, together with Compressed Sparse Offset (CSO), a new representation for sparse filters. We evaluate our implementation of SDC with CSO on the high-bandwidth SX-ACE architecture and show that the inference time of a single convolution layer can be reduced by up to 95% for LeNet, 65% for AlexNet, and 69% for VGG-16 without any drop in accuracy.
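The abstract gives no implementation details for SDC or CSO, so the sketch below is only a hypothetical illustration of the general idea it describes: store each sparse filter as a list of nonzero weights paired with precomputed offsets into the input, and let the convolution loop run directly over that compressed list. The type cso_filter_t, its field names, the single-channel / stride-1 / valid-padding setting, and the offset encoding are all assumptions made for this example, not the authors' actual format or code.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical CSO-style storage for one sparse filter: only the
     * nonzero weights are kept, each paired with a precomputed offset
     * into the input feature map, so the inner convolution loop needs
     * no kernel-index arithmetic. */
    typedef struct {
        int    nnz;   /* number of nonzero weights            */
        float *val;   /* nonzero weight values, length nnz    */
        int   *off;   /* precomputed input offsets, length nnz */
    } cso_filter_t;

    /* Sparse direct convolution (single channel, stride 1, valid padding):
     * each output pixel accumulates only the nonzero weights, reading the
     * input through the precomputed offsets. The reduction over nonzeros
     * is a simple, regular loop, which suits a high-bandwidth vector
     * machine such as the SX-ACE. */
    static void sparse_direct_conv2d(const float *in, int in_h, int in_w,
                                     const cso_filter_t *f, int k_h, int k_w,
                                     float *out)
    {
        int out_h = in_h - k_h + 1;
        int out_w = in_w - k_w + 1;

        for (int y = 0; y < out_h; ++y) {
            for (int x = 0; x < out_w; ++x) {
                const float *base = in + y * in_w + x;
                float acc = 0.0f;
                for (int i = 0; i < f->nnz; ++i)
                    acc += f->val[i] * base[f->off[i]];
                out[y * out_w + x] = acc;
            }
        }
    }

    /* Build the compressed filter from a dense k_h x k_w filter. The
     * offsets bake in the input row stride in_w. */
    static cso_filter_t cso_from_dense(const float *w, int k_h, int k_w, int in_w)
    {
        cso_filter_t f = { 0, NULL, NULL };
        for (int r = 0; r < k_h; ++r)
            for (int c = 0; c < k_w; ++c)
                if (w[r * k_w + c] != 0.0f)
                    f.nnz++;
        f.val = malloc(f.nnz * sizeof *f.val);
        f.off = malloc(f.nnz * sizeof *f.off);
        int i = 0;
        for (int r = 0; r < k_h; ++r)
            for (int c = 0; c < k_w; ++c)
                if (w[r * k_w + c] != 0.0f) {
                    f.val[i] = w[r * k_w + c];
                    f.off[i] = r * in_w + c;  /* offset relative to the output pixel's input base */
                    i++;
                }
        return f;
    }

    int main(void)
    {
        /* Toy example: 5x5 input, 3x3 filter with three nonzero weights. */
        enum { IN_H = 5, IN_W = 5, K_H = 3, K_W = 3 };
        float in[IN_H * IN_W];
        for (int i = 0; i < IN_H * IN_W; ++i) in[i] = (float)i;

        const float w[K_H * K_W] = { 1.0f,  0.0f, 0.0f,
                                     0.0f, -2.0f, 0.0f,
                                     0.0f,  0.0f, 1.0f };

        cso_filter_t f = cso_from_dense(w, K_H, K_W, IN_W);

        float out[(IN_H - K_H + 1) * (IN_W - K_W + 1)];
        sparse_direct_conv2d(in, IN_H, IN_W, &f, K_H, K_W, out);

        for (int i = 0; i < 3 * 3; ++i)
            printf("%6.1f%s", out[i], (i % 3 == 2) ? "\n" : " ");

        free(f.val);
        free(f.off);
        return 0;
    }

Note that in this sketch the offsets depend on the input width, so the compressed filter would have to be rebuilt whenever the input shape changes; how the actual CSO format handles this, and how multiple channels and filters are laid out, is not described in the abstract.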

Author information

Corresponding author

Correspondence to Vijay Daultani.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Daultani, V., Ohno, Y., Ishizaka, K. (2017). Sparse Direct Convolutional Neural Network. In: Cong, F., Leung, A., Wei, Q. (eds) Advances in Neural Networks - ISNN 2017. ISNN 2017. Lecture Notes in Computer Science (LNTCS), vol 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_35

  • DOI: https://doi.org/10.1007/978-3-319-59072-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59071-4

  • Online ISBN: 978-3-319-59072-1

  • eBook Packages: Computer Science, Computer Science (R0)
