DOI:10.1145/1374596.1374600
research-article

Integrated system models for reliable petascale storage systems

Published: 11 November 2007

Abstract

The biggest challenges facing petascale storage systems are not performance or traditional fault-tolerance schemes; they stem from the inherent complexity of large distributed systems. The best way to address these problems is to build a system model into the system itself and have the system continuously compare its current state with the desired state defined by the model. Changes are applied to the model first, and the system then works to reach the new desired state. The hard parts are defining a good model, building good support for introspection, and exploiting these features in the system design. This paper describes the problem in more detail, explains some of the techniques we have used in building the Panasas distributed storage system, and concludes with a summary of the open problems in building petascale storage systems.
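
The model-driven approach described in the abstract can be illustrated with a minimal sketch. This is not the Panasas implementation: the names `SystemModel`, `diff`, and `reconcile`, and the idea of mapping node names to service names, are assumptions made only for illustration. The sketch shows the general pattern of changing the model first and then letting a monitoring pass repair any difference between the observed state and the desired state.

```python
from dataclasses import dataclass, field


@dataclass
class SystemModel:
    """Desired state (hypothetical): which service should run on which node."""
    desired: dict[str, str] = field(default_factory=dict)


def diff(desired: dict[str, str], observed: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return the actions needed to move the observed state toward the desired state."""
    actions = []
    # Nodes that are missing or running the wrong service need a start.
    for node, service in desired.items():
        if observed.get(node) != service:
            actions.append(("start", node, service))
    # Nodes the model does not mention should not be running anything.
    for node, service in observed.items():
        if node not in desired:
            actions.append(("stop", node, service))
    return actions


def reconcile(model: SystemModel, observed: dict[str, str]) -> list[tuple[str, str, str]]:
    """One monitoring pass: compare observed state with the model and repair drift.

    The repair is simulated here by updating the `observed` dict in place;
    a real system would issue commands and re-observe the result.
    """
    actions = diff(model.desired, observed)
    for op, node, service in actions:
        if op == "start":
            observed[node] = service
        else:
            del observed[node]
    return actions
```

In this sketch an administrator never edits `observed` directly. Adding an entry to `model.desired` and calling `reconcile` again is the only way to change the system, which mirrors the "change the model first" rule stated above.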

Cited By

  • (2011) Scale and concurrency of GIGA+. Proceedings of the 9th USENIX Conference on File and Storage Technologies, 13-13. DOI:10.5555/1960475.1960488. Online publication date: 15-Feb-2011
  • (2010) Parallel Data Storage and Access. Scientific Data Management. DOI:10.1201/9781420069815-c2. Online publication date: 6-May-2010
  • (2008) Scalable performance of the Panasas parallel file system. Proceedings of the 6th USENIX Conference on File and Storage Technologies, 1-17. DOI:10.5555/1364813.1364815. Online publication date: 26-Feb-2008

Published In

PDSW '07: Proceedings of the 2nd international workshop on Petascale data storage: held in conjunction with Supercomputing '07
November 2007
72 pages
ISBN:9781595938992
DOI:10.1145/1374596
  • Conference Chair: Garth A. Gibson
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Panasas
  2. autonomics
  3. cluster management
  4. petascale

Conference

SC '07
Acceptance Rates

Overall Acceptance Rate 17 of 41 submissions, 41%
