Abstract
Cloud backup has been an important issue ever since large quantities of valuable data began to be stored on personal computing devices. Data reduction techniques, such as deduplication, delta encoding, and Lempel-Ziv (LZ) compression, performed at the client side before data transfer can ease cloud backup by saving network bandwidth and reducing cloud storage space. However, client-side data reduction in cloud backup services faces efficiency and privacy challenges. In this paper, we present Pangolin, a secure and efficient cloud backup service for personal data storage that exploits application awareness. It speeds up backup operations with an application-aware client-side data reduction technique, and mitigates data security risks by integrating selective encryption into data reduction for sensitive applications. Our experimental evaluation, based on a prototype implementation, shows that our scheme improves data reduction efficiency over state-of-the-art methods by shortening the backup window size to 33%-75%, and that its security mechanism for sensitive applications has negligible impact on backup window size.
Additional information
This work was supported in part by the National High Technology Research and Development 863 Program of China under Grant No. 2013AA013201, and the National Natural Science Foundation of China under Grant Nos. 61025009, 61232003, 61120106005, 61170288, and 61379146.
Cite this article
Fu, YJ., Xiao, N., Liao, XK. et al. Application-Aware Client-Side Data Reduction and Encryption of Personal Data in Cloud Backup Services. J. Comput. Sci. Technol. 28, 1012–1024 (2013). https://doi.org/10.1007/s11390-013-1394-5
DOI: https://doi.org/10.1007/s11390-013-1394-5