Article

Modeling the relative fitness of storage

Authors:
Michael P. Mesnier (Intel and Carnegie Mellon University)
Matthew Wachs (Carnegie Mellon University)
Raja R. Sambasivan (Carnegie Mellon University)
Alice X. Zheng (Carnegie Mellon University)
Gregory R. Ganger (Carnegie Mellon University)

SIGMETRICS '07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, June 2007, Pages 37–48. https://doi.org/10.1145/1254882.1254887

Published: 12 June 2007

ABSTRACT

Relative fitness is a new black-box approach to modeling the performance of storage devices. In contrast with an absolute model that predicts the performance of a workload on a given storage device, a relative fitness model predicts performance differences between a pair of devices. There are two primary advantages to this approach. First, because a relative fitness model is constructed for a device pair, the application-device feedback of a closed workload can be captured (e.g., how the I/O arrival rate changes as the workload moves from device A to device B). Second, a relative fitness model allows performance and resource utilization to be used in place of workload characteristics. This is beneficial when workload characteristics are difficult to obtain or concisely express (e.g., rather than describe the spatio-temporal characteristics of a workload, one could use the observed cache behavior of device A to help predict the performance of B).
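
One way to write down the contrast between the two kinds of model is sketched below. The notation is an assumption introduced here for illustration and does not appear in the abstract: WC stands for workload characteristics, Perf for observed performance, and Util for observed resource utilization, with subscripts naming the device on which each was measured. Expressing the "performance difference" as a ratio is likewise an assumption.

```latex
\begin{align*}
\text{absolute model for } B:&\quad
  f_B(\mathit{WC}) \approx \mathit{Perf}_B \\
\text{relative fitness model for } A \to B:&\quad
  g_{A \to B}(\mathit{WC}_A,\ \mathit{Perf}_A,\ \mathit{Util}_A)
  \approx \frac{\mathit{Perf}_B}{\mathit{Perf}_A}
\end{align*}
```

Under this reading, the relative model takes its inputs from observations made on device A, which is how measurements such as A's cache behavior can stand in for an explicit workload description.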

This paper describes the steps necessary to build a relative fitness model, with an approach that is general enough to be used with any black-box modeling technique. We compare relative fitness models and absolute models across a variety of workloads and storage devices. On average, relative fitness models predict bandwidth and throughput within 10-20% and can reduce prediction error by as much as a factor of two when compared to absolute models.
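
The following is a minimal sketch of how such a model might be trained and used; it is not the authors' implementation. It assumes the relative fitness is expressed as the ratio of device B's performance to device A's, uses a single hypothetical feature (the cache hit rate observed on A), and swaps in a plain least-squares line for whatever black-box technique one would actually choose. All measurements are synthetic.

```python
from statistics import mean


def fit_relative_fitness(samples):
    """Fit ratio = intercept + slope * hit_rate_a by ordinary least squares.

    samples: (hit_rate_a, perf_a, perf_b) tuples obtained by running the
    same workload on both devices. The feature choice is hypothetical.
    """
    xs = [hit for hit, _, _ in samples]
    # Training target: performance on B relative to performance on A.
    ys = [perf_b / perf_a for _, perf_a, perf_b in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict_perf_b(model, hit_rate_a, perf_a):
    """Scale the performance observed on A by the predicted ratio."""
    intercept, slope = model
    return perf_a * (intercept + slope * hit_rate_a)


# Synthetic training set: (cache hit rate on A, MB/s on A, MB/s on B).
training = [
    (0.00, 50.0, 100.0),
    (0.25, 80.0, 140.0),
    (0.50, 120.0, 180.0),
    (0.75, 160.0, 200.0),
    (1.00, 200.0, 200.0),
]

model = fit_relative_fitness(training)
print(predict_perf_b(model, hit_rate_a=0.6, perf_a=100.0))
```

Note that the prediction for B needs only quantities observed while the workload runs on A; no explicit description of the workload's access pattern is supplied.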

Index Terms

Modeling the relative fitness of storage
1. Computing methodologies
  1. Modeling and simulation
    1. Model development and analysis
2. Software and its engineering
  1. Software creation and management
    1. Software development process management
      1. Software development methods

Published in
SIGMETRICS '07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
June 2007
398 pages
ISBN: 9781595936394
DOI: 10.1145/1254882
General Chair: Leana Golubchik (University of Southern California, USA)
Program Chairs: Mostafa Ammar (Georgia Institute of Technology, USA), Mor Harchol-Balter (Carnegie Mellon University, USA)
ACM SIGMETRICS Performance Evaluation Review Volume 35, Issue 1
SIGMETRICS '07 Conference Proceedings
June 2007
382 pages
ISSN: 0163-5999
DOI: 10.1145/1269899
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
CART
black-box
modeling
storage
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 459 of 2,691 submissions, 17%