ABSTRACT
Storage developers face an ongoing need to support new features in existing storage systems, and they also want next-generation storage systems that can adopt newly developed features with relative ease. Extending storage systems is challenging because their codebases are inherently complex and because enabling a new feature must not leave the storage state corrupt or inconsistent. In this work, we examine a new storage architecture, FDMI, that uses the well-established publish-subscribe model to extend the feature set of a host storage system through plugins. A central mechanism in FDMI is transactional coupling. With transactional coupling, a subscribed plugin can either create new transactions that execute asynchronously after the precipitating event completes successfully, or participate in the pending transaction and control whether the precipitating event itself is committed. We further classify transactional mechanisms and desired plugin functionality, and explore the matrix of these two classifications to create a new model for faster, safer distributed storage development.
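The two forms of transactional coupling described above can be illustrated with a minimal single-process sketch. It is not FDMI's actual interface: the `Store` class, the `Event` record, the two plugin lists, and the `drain` method are all hypothetical names chosen for illustration, and a real system would run this across nodes with durable transactions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Event:
    """Hypothetical record describing the precipitating operation."""
    op: str
    key: str
    value: bytes


# Hypothetical plugin signatures (assumptions, not FDMI's API):
# a synchronous plugin votes on the pending transaction,
# an asynchronous plugin runs follow-up work after the commit.
SyncPlugin = Callable[[Event], bool]
AsyncPlugin = Callable[[Event], None]


class Store:
    """Toy in-memory store that publishes put events to subscribed plugins."""

    def __init__(self) -> None:
        self.data: Dict[str, bytes] = {}
        self.sync_plugins: List[SyncPlugin] = []
        self.async_plugins: List[AsyncPlugin] = []
        self.pending_async: List[Tuple[AsyncPlugin, Event]] = []

    def put(self, key: str, value: bytes) -> bool:
        event = Event("put", key, value)
        # Synchronous coupling: each plugin participates in the pending
        # transaction and can veto it by returning False.
        if not all(plugin(event) for plugin in self.sync_plugins):
            return False  # aborted, no state change
        self.data[key] = value  # commit the precipitating event
        # Asynchronous coupling: follow-up work is queued only after
        # the precipitating event has committed.
        for plugin in self.async_plugins:
            self.pending_async.append((plugin, event))
        return True

    def drain(self) -> None:
        """Run queued follow-up work, standing in for a background worker."""
        while self.pending_async:
            plugin, event = self.pending_async.pop(0)
            plugin(event)
```

In this sketch, a size-limit plugin registered in `sync_plugins` could reject oversized writes before they commit, while an audit plugin registered in `async_plugins` would only ever observe writes that did commit.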