Optimization of GPU Memory Usage for Training Deep Neural Networks

Hung, Che-Lun; Hsin, Chine-fu; Wang, Hsiao-Hsi; Tang, Chuan Yi

doi:10.1007/978-3-030-30143-9_23

Conference paper
First Online: 27 November 2019

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1080)

Abstract

Recently, deep neural networks have been applied successfully in many domains, especially computer vision. Well-known convolutional neural networks such as VGG, ResNet, and Inception are used for tasks such as image classification and object detection. The architectures of these state-of-the-art networks have become deeper and more complex than ever, which increases the memory needed to train them. In this paper, we propose a method that addresses the large memory requirement of model training. The experimental results show that the proposed algorithm significantly reduces GPU memory usage.

Acknowledgement

This research was partially supported by the Ministry of Science and Technology under the grants MOST 106-2221-E-126-001-MY2, MOST 108-2221-E-182-031-MY3 and MOST 108-2218-E-126-003.

Author information

Authors and Affiliations

Chang Gung University, Taoyuan, 33302, Taiwan
Che-Lun Hung, Chine-fu Hsin & Hsiao-Hsi Wang
Providence University, Taichung, 43301, Taiwan
Che-Lun Hung, Chine-fu Hsin, Hsiao-Hsi Wang & Chuan Yi Tang

Corresponding author

Correspondence to Che-Lun Hung.

Editor information

Editors and Affiliations

University of Naples Federico II, Naples, Napoli, Italy
Christian Esposito
Soongsil University, Seoul, Korea (Republic of)
Jiman Hong
The University of Texas at San Antonio, San Antonio, TX, USA
Kim-Kwang Raymond Choo

About this paper

Cite this paper

Hung, CL., Hsin, Cf., Wang, HH., Tang, C.Y. (2019). Optimization of GPU Memory Usage for Training Deep Neural Networks. In: Esposito, C., Hong, J., Choo, KK. (eds) Pervasive Systems, Algorithms and Networks. I-SPAN 2019. Communications in Computer and Information Science, vol 1080. Springer, Cham. https://doi.org/10.1007/978-3-030-30143-9_23

DOI: https://doi.org/10.1007/978-3-030-30143-9_23
Published: 27 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30142-2
Online ISBN: 978-3-030-30143-9
eBook Packages: Computer Science, Computer Science (R0)
