Improving Virtualized I/O Performance by Expanding the Polled I/O Path of Linux

Authors:

Nikil Dutt

HotStorage '24: Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems

Pages 31 - 37

https://doi.org/10.1145/3655038.3665944

Published: 08 July 2024

Abstract

The continuing advancement of storage technology has introduced ultra-low latency (ULL) SSDs with access latencies of 20 μs or less. At these latencies, the context-switching overhead of interrupt-driven I/O completion becomes a much larger share of total I/O time, which has prompted consideration of polling as an alternative that avoids this overhead. At the same time, the high price of ULL SSDs remains a major obstacle to the wide adoption of polling.
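The effect of a fixed completion cost on a fast device can be illustrated with simple arithmetic. The sketch below is only an illustration: the 20 μs figure comes from the abstract, while the 5 μs per-I/O interrupt and context-switch cost and the 100 μs conventional SSD latency are hypothetical values chosen for the example, not measurements from the paper.

```python
def interrupt_overhead_share(device_latency_us: float, overhead_us: float) -> float:
    """Fraction of end-to-end I/O latency spent on interrupt handling and context switches."""
    return overhead_us / (device_latency_us + overhead_us)


# Hypothetical per-I/O completion cost; an assumption for illustration only.
ASSUMED_OVERHEAD_US = 5.0

devices = [
    ("ULL SSD (20 us, from the abstract)", 20.0),
    ("conventional SSD (100 us, assumed)", 100.0),
]

for name, latency_us in devices:
    share = interrupt_overhead_share(latency_us, ASSUMED_OVERHEAD_US)
    print(f"{name}: {share:.0%} of total latency is completion overhead")
```

Under these assumptions, the same fixed cost takes up a several times larger share of each request on the faster device, which is the effect that makes polling attractive.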

We argue that virtualized systems can benefit from polling even without ULL SSDs. Because the host page cache resides in DRAM, it can deliver even higher throughput than ULL SSDs. However, a guest operating system in a virtualized environment cannot use polled I/O when accessing the host page cache, and therefore fails to exploit the performance advantage of DRAM. To resolve this inefficiency, we propose expanding the polled I/O path of the Linux kernel I/O stack. Our approach allows guest applications to use I/O polling for buffered I/O and memory-mapped I/O. The expanded I/O path can significantly improve the I/O performance of virtualized systems without modifying the guest application or the backend of the virtual block device. Our proposed buffered I/O path with polling improves 4 KB random read throughput between guest applications and the host page cache by 3.2×.
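For context, Linux already exposes an opt-in polling hint to applications: the `RWF_HIPRI` flag of `preadv2(2)`, available in Python as `os.preadv(..., flags=os.RWF_HIPRI)`. The sketch below issues a 4 KB buffered read with that hint. It is a minimal illustration, not the authors' mechanism: on an unmodified kernel the hint is, as far as we know, acted on only in the direct I/O path, so this buffered read completes normally without polling. The abstract also states that guest applications need not be modified, so the proposed path may not rely on this flag at all. The helper name `read_block_hipri` and the temporary file are assumptions made for the example.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # 4 KB, the request size quoted in the abstract


def read_block_hipri(fd: int, offset: int, size: int = BLOCK_SIZE) -> bytes:
    """Read `size` bytes at `offset`, passing the polling hint where it exists."""
    buf = bytearray(size)
    # RWF_HIPRI is Linux-specific; fall back to 0 (no hint) elsewhere.
    hipri = getattr(os, "RWF_HIPRI", 0)
    try:
        n = os.preadv(fd, [buf], offset, hipri)
    except (AttributeError, OSError):
        # preadv or the flag is unsupported here: use a plain positional read.
        return os.pread(fd, size, offset)
    return bytes(buf[:n])


# Hypothetical test file: two 4 KB blocks with distinct contents.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
    tmp.flush()
    fd = os.open(tmp.name, os.O_RDONLY)  # buffered: no O_DIRECT
    try:
        data = read_block_hipri(fd, BLOCK_SIZE)
    finally:
        os.close(fd)

print(len(data), data[:1])
```

The file is opened without `O_DIRECT` on purpose: it is this buffered path, served from the page cache, for which the abstract proposes adding polling support.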

Cited By

B. Khan and A. Koch. 2024. DeLiBA-K: Speeding-up Hardware-Accelerated Distributed Storage Access by Tighter Linux Kernel Integration and Use of a Modern API. Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, 531-544. DOI: 10.1109/SCW63240.2024.00075. Online publication date: 17-Nov-2024.
https://dl.acm.org/doi/10.1109/SCW63240.2024.00075

Index Terms

Improving Virtualized I/O Performance by Expanding the Polled I/O Path of Linux
1. General and reference
  1. Cross-computing tools and techniques
    1. Performance
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Communications management
        Input / output
    2. Software system structures
      1. Software architectures
        n-tier architectures

Published In

HotStorage '24: Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems

July 2024

141 pages

ISBN: 9798400706301

DOI: 10.1145/3655038

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Sponsors

SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Research Foundation of Korea

Conference

HOTSTORAGE '24

HOTSTORAGE '24: 16th ACM Workshop on Hot Topics in Storage and File Systems

July 8 - 9, 2024

Santa Clara, CA, USA

Acceptance Rates

Overall Acceptance Rate 34 of 87 submissions, 39%
