ABSTRACT
The Data SuperCell (DSC) is a new, disk-based data archive deployed and in production at the Pittsburgh Supercomputing Center (PSC). It is designed to meet the archival demands of large-scale data processing economically. DSC combines PSC's SLASH2 layered filesystem technology with commodity hardware and open software to provide superior functionality, flexibility, manageability, reliability, performance and cost-effectiveness. We describe DSC functionality goals; the SLASH2 architecture, its capabilities and its suitability for archival applications; ZFS as an underlying file system; and the DSC architecture, structure and capabilities. We then discuss our experience with DSC, present some performance measurements and outline plans for further development.