ABSTRACT
The IO performance of storage devices has accelerated from hundreds of IOPS five years ago to hundreds of thousands of IOPS today, with tens of millions of IOPS projected in five years. This sharp evolution is primarily due to the introduction of NAND-flash devices and their data-parallel design. In this work, we demonstrate that the block layer within the operating system, originally designed to handle thousands of IOPS, has become a bottleneck to overall storage system performance, especially on the high NUMA-factor processor systems that are becoming commonplace. We describe the design of a next-generation block layer that is capable of handling tens of millions of IOPS on a multi-core system equipped with a single storage device. Our experiments show that our design scales gracefully with the number of cores, even on NUMA systems with multiple sockets.
Index Terms
- Linux block IO: introducing multi-queue SSD access on multi-core systems