Application-Aware Client-Side Data Reduction and Encryption of Personal Data in Cloud Backup Services

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Cloud backup has become an important concern as large quantities of valuable data are now stored on personal computing devices. Data reduction techniques, such as deduplication, delta encoding, and Lempel-Ziv (LZ) compression, performed at the client side before data transfer can ease cloud backup by saving network bandwidth and reducing cloud storage space. However, client-side data reduction in cloud backup services faces efficiency and privacy challenges. In this paper, we present Pangolin, a secure and efficient cloud backup service for personal data storage that exploits application awareness. It speeds up backup operations through an application-aware client-side data reduction technique, and mitigates data security risks by integrating selective encryption into data reduction for sensitive applications. Our experimental evaluation, based on a prototype implementation, shows that our scheme improves data reduction efficiency over state-of-the-art methods by shortening the backup window size to 33%~75%, and that its security mechanism for sensitive applications has negligible impact on backup window size.
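
As a rough illustration of the client-side data reduction named in the abstract, the sketch below combines chunk-level deduplication with LZ-based compression before upload. It is not Pangolin's actual design: the fixed 4 KB chunk size, the SHA-256 fingerprints, the use of zlib (DEFLATE, an LZ77 variant), and the function names `reduce_for_upload` and `restore` are all assumptions made for illustration. Delta encoding, application awareness, and selective encryption are omitted.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for illustration


def reduce_for_upload(data: bytes, known_fingerprints: set):
    """Split data into chunks and keep only chunks not seen before.

    Returns (recipe, payload): recipe is the ordered list of chunk
    fingerprints needed to rebuild the data; payload maps each new
    fingerprint to its zlib-compressed chunk.
    """
    recipe = []
    payload = {}
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        recipe.append(fingerprint)
        if fingerprint not in known_fingerprints:
            # New chunk: remember it and compress it for transfer.
            known_fingerprints.add(fingerprint)
            payload[fingerprint] = zlib.compress(chunk)
    return recipe, payload


def restore(recipe, store):
    """Rebuild the original bytes from a recipe and a fingerprint -> chunk store."""
    return b"".join(zlib.decompress(store[fp]) for fp in recipe)
```

In this sketch, a second backup of unchanged data produces an empty payload, since every fingerprint is already known; only the recipe would need to be sent.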

Author information

Corresponding author

Correspondence to Nong Xiao.

Additional information

This work was supported in part by the National High Technology Research and Development 863 Program of China under Grant No. 2013AA013201, the National Natural Science Foundation of China under Grant Nos. 61025009, 61232003, 61120106005, 61170288, and 61379146.

About this article

Cite this article

Fu, YJ., Xiao, N., Liao, XK. et al. Application-Aware Client-Side Data Reduction and Encryption of Personal Data in Cloud Backup Services. J. Comput. Sci. Technol. 28, 1012–1024 (2013). https://doi.org/10.1007/s11390-013-1394-5

  • DOI: https://doi.org/10.1007/s11390-013-1394-5
