DOI: 10.1145/1362622.1362659
Research article

Integrating parallel file systems with object-based storage devices

Published: 10 November 2007

Abstract

As storage systems evolve, the block-based design of today's disks is becoming inadequate. As an alternative, object-based storage devices (OSDs) offer a view where the disk manages data layout and keeps track of various attributes about data objects. By moving functionality that is traditionally the responsibility of the host OS to the disk, it is possible to improve overall performance and simplify management of a storage system. The capabilities of OSDs will also permit performance improvements in parallel file systems, such as further decoupling metadata operations and thus reducing metadata server bottlenecks.
In this work we present an implementation of the Parallel Virtual File System (PVFS) integrated with a software emulator of an OSD and describe an infrastructure for client access. Even with the overhead of emulation, performance is comparable to a traditional server-fronted implementation, demonstrating that serverless parallel file systems using OSDs are an achievable goal.

Published In

SC '07: Proceedings of the 2007 ACM/IEEE conference on Supercomputing
November 2007
723 pages
ISBN:9781595937643
DOI:10.1145/1362622
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

SC '07 Paper Acceptance Rate 54 of 268 submissions, 20%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Cited By

  • (2019) D3N: A multi-layer cache for the rest of us. 2019 IEEE International Conference on Big Data (Big Data), pages 327-338. DOI: 10.1109/BigData47090.2019.9006396. Online publication date: Dec-2019
  • (2018) Toward Transparent Data Management in Multi-Layer Storage Hierarchy of HPC Systems. 2018 IEEE International Conference on Cloud Engineering (IC2E), pages 211-217. DOI: 10.1109/IC2E.2018.00046. Online publication date: Apr-2018
  • (2017) Rethinking key-value store for parallel I/O optimization. International Journal of High Performance Computing Applications, 31(4), pages 335-356. DOI: 10.1177/1094342016677084. Online publication date: 1-Jul-2017
  • (2017) SoMeta: Scalable Object-Centric Metadata Management for High Performance Computing. 2017 IEEE International Conference on Cluster Computing (CLUSTER), pages 359-369. DOI: 10.1109/CLUSTER.2017.53. Online publication date: Sep-2017
  • (2014) Using an Object-Based Active Storage Framework to Improve Parallel Storage Systems. Proceedings of the 2014 43rd International Conference on Parallel Processing Workshops, pages 70-78. DOI: 10.1109/ICPPW.2014.22. Online publication date: 9-Sep-2014
  • (2014) Rethinking key-value store for parallel I/O optimization. Proceedings of the 2014 International Workshop on Data Intensive Scalable Computing Systems, pages 33-40. DOI: 10.1109/DISCS.2014.11. Online publication date: 16-Nov-2014
  • (2013) Grid Data Handling. IT Policy and Ethics, pages 294-321. DOI: 10.4018/978-1-4666-2919-6.ch014. Online publication date: 2013
  • (2013) Toward a unified object storage foundation for scalable storage systems. 2013 IEEE International Conference on Cluster Computing (CLUSTER), pages 1-8. DOI: 10.1109/CLUSTER.2013.6702691. Online publication date: Sep-2013
  • (2013) Optimizations on the Parallel Virtual File System implementation integrated with Object-Based Storage Devices. 2013 IEEE International Conference on Cluster Computing (CLUSTER), pages 1-5. DOI: 10.1109/CLUSTER.2013.6702660. Online publication date: Sep-2013
  • (2013) An object interface storage node for clustered file systems. 2013 IEEE International Conference on Cluster Computing (CLUSTER), pages 1-5. DOI: 10.1109/CLUSTER.2013.6702624. Online publication date: Sep-2013
