research-article

Efficient management of idleness in storage systems

Authors:

Evgenia Smirni,

Erik Riedel

ACM Transactions on Storage (TOS), Volume 5, Issue 2

Article No.: 4, Pages 1 - 25

https://doi.org/10.1145/1534912.1534913

Published: 12 June 2009

Abstract

Various activities intended to enhance the performance, reliability, and availability of storage systems are scheduled with low priority and served during idle times. Under such conditions, idleness becomes a valuable “resource” that needs to be efficiently managed. A common approach in system design is to be non-work-conserving by “idle waiting”, that is, to delay the scheduling of background jobs to avoid slowing down upcoming foreground tasks.

In this article, we complement “idle waiting” with the “estimation” of background work to be served in every idle interval to effectively manage the trade-off between the performance of foreground and background tasks. As a result, the storage system is better utilized without compromising foreground performance. Our analysis shows that if idle times have low variability, then idle waiting is not necessary. Only if idle times are highly variable does idle waiting become necessary to minimize the impact of background activity on foreground performance. We further show that if there is burstiness in idle intervals, then it is possible to predict accurately the length of incoming idle intervals and use this information to serve more background jobs without affecting foreground performance.
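The trade-off described above can be illustrated with a minimal sketch. It is not the article's model: the function name, the non-preemptive background jobs, the fixed background service time, and the per-interval job cap standing in for the "estimation" of background work are all assumptions made for illustration.

```python
def simulate_idle_policy(idle_intervals, idle_wait, max_jobs, service_time):
    """Replay a trace of idle-interval lengths under an idle-wait policy.

    Assumptions (hypothetical, not taken from the article):
      - background jobs are non-preemptive and all take `service_time`;
      - at most `max_jobs` background jobs are started per idle interval;
      - the first background job starts `idle_wait` time units into the
        interval, and only if the interval is still ongoing.

    Returns (completed_jobs, delayed_intervals, total_foreground_delay).
    """
    completed = 0       # background jobs served
    delayed = 0         # idle intervals whose ending foreground job was delayed
    total_delay = 0.0   # total time foreground work waited on background jobs

    for length in idle_intervals:
        t = idle_wait   # time within the interval when the next job may start
        served = 0
        while served < max_jobs and t < length:
            finish = t + service_time
            if finish > length:
                # A foreground request arrived while the job was in service;
                # it waits until the non-preemptive background job finishes.
                delayed += 1
                total_delay += finish - length
            completed += 1
            served += 1
            t = finish

    return completed, delayed, total_delay


if __name__ == "__main__":
    low_variability = [10, 10, 10, 10]
    high_variability = [1, 1, 1, 37]   # same total idle time, much burstier
    for name, trace in [("low", low_variability), ("high", high_variability)]:
        for wait in (0, 2):
            print(name, wait, simulate_idle_policy(trace, wait, 3, 3))
```

On the low-variability trace, starting immediately (`idle_wait=0`) serves 12 background jobs with no foreground delay, so waiting buys nothing. On the high-variability trace, starting immediately returns `(6, 3, 6.0)`: three short intervals each delay a foreground request. Waiting 2 time units returns `(3, 0, 0.0)`, trading background throughput for zero foreground impact.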

Index Terms

Efficient management of idleness in storage systems
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
2. General and reference
  1. Cross-computing tools and techniques
    1. Design
    2. Reliability

Published In

ACM Transactions on Storage Volume 5, Issue 2

June 2009

95 pages

ISSN:1553-3077

EISSN:1553-3093

DOI:10.1145/1534912

Copyright © 2009 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 June 2009

Accepted: 01 January 2009

Revised: 01 August 2008

Received: 01 July 2008

Published in TOS Volume 5, Issue 2

Qualifiers

Research-article
Research
Refereed

Bibliometrics

Total Citations: 37
Total Downloads: 617
Downloads (Last 12 months): 11
Downloads (Last 6 weeks): 1

Reflects downloads up to 05 Mar 2025
