Research article
DOI: 10.1145/2043556.2043563

Differentiated storage services

Published: 23 October 2011

Abstract

We propose an I/O classification architecture to close the widening semantic gap between computer systems and storage systems. By classifying I/O, a computer system can request that different classes of data be handled with different storage system policies. Specifically, when a storage system is first initialized, we assign performance policies to predefined classes, such as the filesystem journal. Then, online, we include a classifier with each I/O command (e.g., SCSI), thereby allowing the storage system to enforce the associated policy for each I/O that it receives.
Our immediate application is caching. We present filesystem prototypes and a database proof-of-concept that classify all disk I/O, with very little modification to the filesystem, database, and operating system. We associate caching policies with various classes (e.g., large files shall be evicted before metadata and small files), and we show that end-to-end filesystem performance can be improved by over a factor of two, relative to conventional caches like LRU. Caching is only one of many possible applications. As part of our ongoing work, we are exploring other classes, policies, and storage system mechanisms that can be used to improve end-to-end performance, reliability, and security.
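
The classification flow described in the abstract can be sketched as a small simulation. This is a minimal illustration, not the authors' implementation: the class names, the numeric priorities, the `IORequest` type, and the `ClassAwareCache` structure are all assumptions introduced here. It shows a policy table fixed at initialization, a class label carried with each request, and an eviction rule that removes blocks of the lowest-priority class first and falls back to LRU order within a class.

```python
from collections import OrderedDict
from dataclasses import dataclass

# Hypothetical policy table, assigned once at initialization.
# Class names and priority values are illustrative assumptions;
# a higher number means "keep in cache longer".
POLICY = {
    "metadata": 3,
    "journal": 3,
    "small_file": 2,
    "large_file": 1,
}


@dataclass(frozen=True)
class IORequest:
    lba: int        # logical block address
    io_class: str   # classifier carried with the I/O command


class ClassAwareCache:
    """Evicts the lowest-priority class first, LRU within a class."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        # One LRU queue per priority level, oldest entry first.
        self.queues = {p: OrderedDict() for p in set(policy.values())}
        self.where = {}  # lba -> priority of the queue holding it

    def access(self, req):
        """Return True on a hit, False on a miss (the block is then cached)."""
        prio = self.policy[req.io_class]
        hit = req.lba in self.where
        if hit:
            # Remove the old entry so the block can be re-queued as most recent.
            del self.queues[self.where[req.lba]][req.lba]
        elif len(self.where) >= self.capacity:
            self._evict()
        self.queues[prio][req.lba] = True  # insert at the most-recent end
        self.where[req.lba] = prio
        return hit

    def _evict(self):
        # Scan priorities from lowest to highest and evict the oldest block found.
        for prio in sorted(self.queues):
            if self.queues[prio]:
                lba, _ = self.queues[prio].popitem(last=False)
                del self.where[lba]
                return


cache = ClassAwareCache(capacity=2, policy=POLICY)
cache.access(IORequest(lba=20, io_class="metadata"))
cache.access(IORequest(lba=10, io_class="large_file"))
# The cache is full. Plain LRU would evict LBA 20 (the oldest block);
# this sketch evicts LBA 10 instead because its class has lower priority.
cache.access(IORequest(lba=30, io_class="small_file"))
```

Keeping one LRU queue per priority level makes the eviction scan proportional to the number of classes rather than the number of cached blocks, which is one simple way to combine a class ordering with recency.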

Published In

SOSP '11: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
October 2011
417 pages
ISBN: 9781450309776
DOI: 10.1145/2043556

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. caching
  2. classification
  3. quality of service
  4. solid-state storage

Qualifiers

  • Research-article

Conference

SOSP '11

Acceptance Rates

Overall Acceptance Rate 174 of 961 submissions, 18%

Cited By

  • (2024) Configuring and Coordinating End-to-end QoS for Emerging Storage Infrastructure. ACM Transactions on Modeling and Performance Evaluation of Computing Systems 9:1 (1-32). DOI: 10.1145/3631606. Online publication date: 15-Jan-2024
  • (2024) LMPT: A Novel Authenticated Data Structure to Eliminate Storage Bottlenecks for High Performance Blockchains. IEEE Transactions on Network and Service Management 21:2 (1333-1343). DOI: 10.1109/TNSM.2023.3346202. Online publication date: Apr-2024
  • (2023) CacheSack: Theory and Experience of Google’s Admission Optimization for Datacenter Flash Caches. ACM Transactions on Storage 19:2 (1-24). DOI: 10.1145/3582014. Online publication date: 6-Mar-2023
  • (2023) KVRangeDB: Range Queries for a Hash-based Key–Value Device. ACM Transactions on Storage 19:3 (1-21). DOI: 10.1145/3582013. Online publication date: 19-Jun-2023
  • (2023) Localized Validation Accelerates Distributed Transactions on Disaggregated Persistent Memory. ACM Transactions on Storage 19:3 (1-35). DOI: 10.1145/3582012. Online publication date: 19-Jun-2023
  • (2023) Incorporating A Triple Graph Neural Network with Multiple Implicit Feedback for Social Recommendation. ACM Transactions on the Web. DOI: 10.1145/3580517. Online publication date: 21-Jan-2023
  • (2023) Extending and Programming the NVMe I/O Determinism Interface for Flash Arrays. ACM Transactions on Storage 19:1 (1-33). DOI: 10.1145/3568427. Online publication date: 11-Jan-2023
  • (2023) Taming Metadata-intensive HPC Jobs Through Dynamic, Application-agnostic QoS Control. 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (47-61). DOI: 10.1109/CCGrid57682.2023.00015. Online publication date: May-2023
  • (2022) HintStor: A Framework to Study I/O Hints in Heterogeneous Storage. ACM Transactions on Storage 18:2 (1-24). DOI: 10.1145/3489143. Online publication date: 10-Mar-2022
  • (2020) Desperately seeking ... optimal multi-tier cache configurations. Proceedings of the 12th USENIX Conference on Hot Topics in Storage and File Systems (6-6). DOI: 10.5555/3488733.3488739. Online publication date: 13-Jul-2020
