Euge: Effective Utilization of GPU Resources for Serving DNN-Based Video Analysis

Chen, Qihang; Ding, Guangyao; Xu, Chen; Qian, Weining; Zhou, Aoying

doi:10.1007/978-3-030-60290-1_40

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12318)

Included in the following conference series:

Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data

Abstract

Deep neural networks (DNNs) have been widely adopted in video analysis applications. The computation involved in DNNs is more efficient on GPUs than on CPUs. However, recent serving systems suffer from low GPU utilization, due to limited process parallelism and the storage overhead of DNN models. We propose Euge, which introduces Multi-Process Service (MPS) and model sharing to support effective utilization of the GPU. With MPS, multiple processes overcome the obstacle posed by the GPU context and execute DNN-based video analysis on one GPU in parallel. Furthermore, by sharing the DNN model among threads within a process, Euge reduces the GPU memory overhead. We implement Euge on Spark and demonstrate its performance on a vehicle detection workload.
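
The model-sharing idea in the abstract can be illustrated with a minimal Python sketch. This is not Euge's implementation, which runs on Spark with real DNN models on a GPU. It only shows, under stated assumptions, how several threads in one process might reuse a single loaded model instead of each holding its own copy. The names `DummyDetector`, `get_shared_model`, `analyze`, and `run` are hypothetical, and the "weights" are a placeholder for parameters that would normally occupy GPU memory. MPS itself is not shown here.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DummyDetector:
    """Hypothetical stand-in for a DNN detector whose weights live in GPU memory."""

    load_count = 0  # counts how many times the (expensive) weights were loaded

    def __init__(self):
        DummyDetector.load_count += 1
        self.weights = [0.5, 1.5, 2.0]  # placeholder for model parameters

    def detect(self, frame):
        # The weights are only read, so concurrent calls do not interfere.
        return sum(w * frame for w in self.weights)


_model = None
_model_lock = threading.Lock()


def get_shared_model():
    """Return the single per-process model, loading it on first use."""
    global _model
    if _model is None:
        with _model_lock:
            if _model is None:  # re-check after acquiring the lock
                _model = DummyDetector()
    return _model


def analyze(frame):
    # Every worker thread goes through the same shared model instance.
    return get_shared_model().detect(frame)


def run(frames, num_threads=4):
    # Threads within one process share memory, so the model is loaded once.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(analyze, frames))
```

Calling `run([1, 2, 3])` processes the three placeholder frames on a thread pool while constructing `DummyDetector` only once. With one model per thread, the memory cost would instead grow with the number of threads.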

Acknowledgment

This work has been supported through grants by the National Key Research & Development Program of China (No. 2018YFB1003400), National Natural Science Foundation of China (No. 61902128, 61732014) and Shanghai Sailing Program (No. 19YF1414200).

Author information

Authors and Affiliations

School of Data Science and Engineering, East China Normal University, Shanghai, 200062, China
Qihang Chen, Guangyao Ding, Chen Xu, Weining Qian & Aoying Zhou

Corresponding author

Correspondence to Chen Xu.

Editor information

Editors and Affiliations

Tianjin University, Tianjin, China
Xin Wang
University of Melbourne, Melbourne, VIC, Australia
Rui Zhang
Kyung Hee University, Yongin, Korea (Republic of)
Young-Koo Lee
Nanjing University of Information Science and Technology, Nanjing, China
Le Sun
Kangwon National University, Chunchon, Korea (Republic of)
Yang-Sae Moon

Cite this paper

Chen, Q., Ding, G., Xu, C., Qian, W., Zhou, A. (2020). Euge: Effective Utilization of GPU Resources for Serving DNN-Based Video Analysis. In: Wang, X., Zhang, R., Lee, YK., Sun, L., Moon, YS. (eds) Web and Big Data. APWeb-WAIM 2020. Lecture Notes in Computer Science, vol 12318. Springer, Cham. https://doi.org/10.1007/978-3-030-60290-1_40

DOI: https://doi.org/10.1007/978-3-030-60290-1_40
Published: 14 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60289-5
Online ISBN: 978-3-030-60290-1
eBook Packages: Computer Science, Computer Science (R0)
