ABSTRACT
Object storage clouds (e.g., Amazon S3) have become extremely popular due to their highly usable interface and cost-effectiveness. They are therefore widely used by various applications (e.g., Dropbox) to host user data. However, because object storage clouds are flat and lack the concept of a directory, file meta-data and directory structure must be maintained in a separate index cloud. This paper investigates the possibility of using a single object storage cloud to efficiently host the whole filesystem for users, including both the file content and directories, while avoiding meta-data loss caused by index cloud failures. We design a novel data structure, Hierarchical Hash (or H2), to natively enable the efficient mapping from filesystem operations to object-level operations. Based on H2, we implement a prototype system, H2Cloud, that can maintain large filesystems of users in an object storage cloud and support fast directory operations. Both theoretical analysis and real-world experiments confirm the efficacy of our solution: H2Cloud achieves directory operations that are orders of magnitude faster than those of OpenStack Swift, and it performs comparably to Dropbox without needing a separate index cloud.
Index Terms
- H2Cloud: Maintaining the Whole Filesystem in an Object Storage Cloud