Machine Learning Using Virtualized GPUs in Cloud Environments

  • Conference paper
High Performance Computing (ISC High Performance 2017)

Abstract

Using graphics processing units (GPUs) to accelerate machine learning applications has become a focus of high-performance computing (HPC) in recent years. In cloud environments, many cloud-based GPU solutions have been introduced to use GPU resources seamlessly and securely without sacrificing their performance benefits. Two main approaches stand out: direct pass-through technologies available on hypervisors, and virtual GPU technologies introduced by GPU vendors. In this paper, we present a performance study of these two GPU virtualization solutions for machine learning in the cloud. We evaluate the advantages and disadvantages of each solution and report new findings on their performance impact on machine learning applications in different real-world use-case scenarios. We also examine the benefits of virtual GPUs both for machine learning alone and for machine learning applications that run alongside other GPU-based applications, such as 3D graphics, on the same multi-GPU server to make better use of computing resources. Based on experimental results from benchmarking machine learning applications developed with TensorFlow, we discuss scaling from one to multiple GPUs and compare the performance of two virtual GPU solutions. Finally, we show that mixing machine learning with other GPU-based workloads can reduce the combined execution time compared with running these workloads sequentially.

Acknowledgements

The authors would like to thank Josh Simons, Na Zhang, Julie Brodeur, Aravind Bappanadu, and Bruce Herndon for their support for this project.

Author information

Corresponding author

Correspondence to Uday Kurkure.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Kurkure, U., Sivaraman, H., Vu, L. (2017). Machine Learning Using Virtualized GPUs in Cloud Environments. In: Kunkel, J., Yokota, R., Taufer, M., Shalf, J. (eds) High Performance Computing. ISC High Performance 2017. Lecture Notes in Computer Science, vol 10524. Springer, Cham. https://doi.org/10.1007/978-3-319-67630-2_41

  • DOI: https://doi.org/10.1007/978-3-319-67630-2_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67629-6

  • Online ISBN: 978-3-319-67630-2

  • eBook Packages: Computer Science, Computer Science (R0)
