DOI: 10.1145/2663716.2663747

Towards Network-level Efficiency for Cloud Storage Services

Published: 05 November 2014

Abstract

Cloud storage services such as Dropbox, Google Drive, and Microsoft OneDrive provide users with a convenient and reliable way to store and share data from anywhere, on any device, and at any time. The cornerstone of these services is the data synchronization (sync) operation which automatically maps the changes in users' local filesystems to the cloud via a series of network communications in a timely manner. If not designed properly, however, the tremendous amount of data sync traffic can potentially cause (financial) pains to both service providers and users.
This paper addresses a simple yet critical question: Is the current data sync traffic of cloud storage services efficiently used? We first define a novel metric named TUE to quantify the Traffic Usage Efficiency of data synchronization. Based on both real-world traces and comprehensive experiments, we study and characterize the TUE of six widely used cloud storage services. Our results demonstrate that a considerable portion of the data sync traffic is in a sense wasteful, and can be effectively avoided or significantly reduced via carefully designed data sync mechanisms. All in all, our study of TUE of cloud storage services not only provides guidance for service providers to develop more efficient, traffic-economic services, but also helps users pick appropriate services that best fit their needs and budgets.
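
The abstract names the TUE metric but does not spell out its formula. As a rough illustration only, the sketch below assumes TUE is the ratio of total data sync traffic to the size of the data update, so that a value near 1 means little overhead and larger values mean more wasted traffic. The function name, parameter names, and byte counts are hypothetical and are not taken from the paper.

```python
def traffic_usage_efficiency(sync_traffic_bytes: int, update_size_bytes: int) -> float:
    """Return TUE under the ASSUMED definition: sync traffic / update size."""
    if update_size_bytes <= 0:
        raise ValueError("update_size_bytes must be positive")
    if sync_traffic_bytes < 0:
        raise ValueError("sync_traffic_bytes must be non-negative")
    return sync_traffic_bytes / update_size_bytes


# Hypothetical measurements, chosen only to show how the ratio behaves.
small_edit = traffic_usage_efficiency(sync_traffic_bytes=40_000, update_size_bytes=1_000)
large_upload = traffic_usage_efficiency(sync_traffic_bytes=10_500_000, update_size_bytes=10_000_000)

print(f"small edit TUE:   {small_edit:.2f}")
print(f"large upload TUE: {large_upload:.2f}")
```

Under this assumed definition, a small edit that triggers a disproportionately large sync yields a high ratio, which is the kind of wasteful traffic the abstract describes.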


Published In

IMC '14: Proceedings of the 2014 Conference on Internet Measurement Conference
November 2014
524 pages
ISBN:9781450332132
DOI:10.1145/2663716
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cloud storage service
  2. data synchronization
  3. network-level efficiency
  4. traffic usage efficiency

Qualifiers

  • Research-article

Conference

IMC '14: Internet Measurement Conference
November 5-7, 2014
Vancouver, BC, Canada

Acceptance Rates

IMC '14 Paper Acceptance Rate 32 of 103 submissions, 31%;
Overall Acceptance Rate 277 of 1,083 submissions, 26%
