VBMq: pursuit baremetal performance by embracing block I/O parallelism in virtualization

  • Research Article

Frontiers of Computer Science

Abstract

Barely acceptable block I/O performance prevents virtualization from being widely used in high-performance computing. Although the virtio paravirtual framework greatly improves I/O performance, performance still degrades sharply when a virtual machine accesses high-performance NAND-flash-based devices, which are designed for data-parallel access. The primary cause is the lack of block I/O parallelism in hypervisors such as KVM and Xen. In this paper, we propose a novel block I/O layer design for virtualization, named VBMq. VBMq is based on the virtio paravirtual I/O model and aims to solve the block I/O parallelism problem in virtualization. It uses multiple dedicated I/O threads to handle I/O requests in parallel. Meanwhile, we use a polling mechanism to reduce the overhead of the frequent context switches caused by notifications between the VM and its hypervisor. Each dedicated I/O thread is assigned to its own non-overlapping core, which improves performance by avoiding unnecessary scheduling. In addition, we configure CPU affinity to optimize I/O completion for each request. The CPU affinity setting helps reduce the CPU cache miss rate and increase CPU efficiency. The prototype system is based on the Linux 4.1 kernel and QEMU 2.3.1. Our measurements show that the proposed method scales gracefully in a multi-core environment, provides up to 39.6x better performance than the baseline, and approaches bare-metal performance.
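The design outlined in the abstract (several dedicated I/O threads, each polling its own queue and bound to its own core) can be illustrated with a small user-space sketch. This is not VBMq's implementation, which lives in the Linux kernel and QEMU; the names `PollingIOThread`, `run_demo`, and the `vcpu % num_queues` dispatch rule are assumptions introduced only for illustration, and core pinning is best-effort because `os.sched_setaffinity` is available only on Linux.

```python
import os
import threading
import time
from collections import deque


def available_cores():
    """Cores this process may run on (falls back to a plain count off Linux)."""
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))
    return list(range(os.cpu_count() or 1))


def pin_to_core(core):
    """Best-effort pin of the calling thread to one core; True on success."""
    try:
        os.sched_setaffinity(0, {core})  # on Linux, 0 targets the calling thread
        return True
    except (AttributeError, OSError):
        return False


class PollingIOThread(threading.Thread):
    """Hypothetical dedicated worker: polls its own queue instead of
    waiting to be notified about each request."""

    def __init__(self, index, core):
        super().__init__(daemon=True)
        self.index = index
        self.core = core
        self.ring = deque()      # stand-in for one request queue
        self.completed = []      # requests finished by this worker, in order
        self.stopping = threading.Event()

    def run(self):
        pin_to_core(self.core)
        while True:
            # Sample the stop flag before polling so that no request
            # enqueued before shutdown is left behind.
            stop_seen = self.stopping.is_set()
            try:
                request = self.ring.popleft()
            except IndexError:
                if stop_seen:
                    return
                time.sleep(0)    # yield; a real poller would spin or back off
                continue
            self.completed.append(request)  # "handle" the request


def run_demo(num_queues=4, num_vcpus=8, requests_per_vcpu=16):
    """Submit requests from several simulated vCPUs and return the
    per-queue completion lists."""
    cores = available_cores()
    # Cores are distinct per worker only if enough cores are available.
    workers = [PollingIOThread(i, cores[i % len(cores)])
               for i in range(num_queues)]
    for worker in workers:
        worker.start()
    for seq in range(requests_per_vcpu):
        for vcpu in range(num_vcpus):
            # Assumed dispatch rule: each vCPU always uses the same queue.
            workers[vcpu % num_queues].ring.append((vcpu, seq))
    for worker in workers:
        worker.stopping.set()
    for worker in workers:
        worker.join()
    return [worker.completed for worker in workers]


if __name__ == "__main__":
    for queue_index, done in enumerate(run_demo()):
        print(f"queue {queue_index}: {len(done)} requests completed")
```

Because every simulated vCPU always submits to the same queue, requests from one vCPU complete in submission order without any lock shared between workers, which is the property a multi-queue design relies on. The sketch says nothing about real performance: Python threads share an interpreter lock, so it shows only the structure, not the speedup reported in the abstract.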

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61321491).

Author information

Corresponding author

Correspondence to Diming Zhang.

Additional information

Diming Zhang received his BE degree in science from Soochow University, China in 2009, and his Master's degree in software engineering from Southeast University, China in 2011. In 2011, he joined the College of Computer Engineering, Jiangsu University of Science and Technology, China as a lecturer. His current research interests are operating systems, parallel computing, and architecture.

Fei Xue received his BE degree in science from Nanjing University (NJU), China in 2015. He is a PhD candidate at the Faculty of Computer Science and Technology of NJU. His current research interests are operating systems, parallel computing, and architecture.

Hao Huang received his ME degree from Xiamen University, China in 1982 and his PhD degree from Nanjing University (NJU), China in 1999. He is now a professor in the Faculty of Computer Science and Technology at NJU. His research interests include computer architecture, information security, and formal verification.

Shaodi You received his PhD and ME degrees from The University of Tokyo, Japan in 2015 and 2012, respectively, and his Bachelor's degree from Tsinghua University, China in 2009. He is currently a research scientist at Data61-CSIRO (formerly known as NICTA), Australia. He also serves as an adjunct lecturer at the Australian National University, Australia. His research interests are physics-based vision, deep learning, and high-performance computing. He is best known for physics-based vision in bad weather, especially in rainy scenes. He serves as a reviewer for TPAMI, IJCV, TIP, CVPR, ICCV, SIGGRAPH, etc. He is currently the Chair of the IEEE Computer Society, Australian Capital Territory Section, Australia.

About this article

Cite this article

Zhang, D., Xue, F., Huang, H. et al. VBMq: pursuit baremetal performance by embracing block I/O parallelism in virtualization. Front. Comput. Sci. 12, 873–886 (2018). https://doi.org/10.1007/s11704-017-6466-1

  • DOI: https://doi.org/10.1007/s11704-017-6466-1
