FC-LID: File Classifier Based Linear Indexing for Deduplication in Cloud Backup Services

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9581)

Abstract

Data deduplication techniques are effective solutions for reducing both the bandwidth and the storage space requirements of cloud backup services in data centers. During the deduplication process, maintaining an index in RAM is a fundamental operation. A very large index, however, needs more storage space: it is hard to hold such an index entirely in RAM, and accessing it on disk decreases throughput. To overcome this problem, an index system is developed based on File Classifier based Linear Indexing Deduplication, called FC-LID, which utilizes Linear Hashing with Representative Group (LHRG). The proposed linear index structure reduces the computational overhead of deduplication and increases deduplication efficiency.
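
The abstract does not describe LHRG in detail, so the following is only a minimal sketch of the general idea it builds on: a chunk-fingerprint index organised with textbook linear hashing, which grows one bucket at a time instead of rehashing the whole table. The class name `LinearHashIndex`, the SHA-1 fingerprints, the fixed 4 KB chunking, and the load threshold are all assumptions made for illustration; the file classifier and the representative-group mechanism of FC-LID are not modelled here.

```python
import hashlib


def fingerprint(chunk: bytes) -> bytes:
    """Chunk fingerprint; SHA-1 is an assumption, not taken from the paper."""
    return hashlib.sha1(chunk).digest()


class LinearHashIndex:
    """Textbook linear hashing: fingerprint -> chunk location (hypothetical sketch)."""

    def __init__(self, initial_buckets=4, max_load=2.0):
        self.n0 = initial_buckets   # bucket count at level 0
        self.level = 0              # completed doubling rounds
        self.split = 0              # next bucket to split in this round
        self.max_load = max_load    # average entries per bucket that triggers a split
        self.buckets = [dict() for _ in range(initial_buckets)]
        self.count = 0

    def _bucket_of(self, fp: bytes) -> int:
        h = int.from_bytes(fp[:8], "big")
        i = h % (self.n0 << self.level)
        if i < self.split:          # this bucket was already split in the current round
            i = h % (self.n0 << (self.level + 1))
        return i

    def lookup(self, fp: bytes):
        return self.buckets[self._bucket_of(fp)].get(fp)

    def insert(self, fp: bytes, location) -> bool:
        """Return True if the fingerprint is new, False if it is a duplicate."""
        bucket = self.buckets[self._bucket_of(fp)]
        if fp in bucket:
            return False
        bucket[fp] = location
        self.count += 1
        if self.count / len(self.buckets) > self.max_load:
            self._split_one()
        return True

    def _split_one(self):
        # Split exactly one bucket; its entries are redistributed between the
        # old bucket and one newly appended bucket.
        old = self.buckets[self.split]
        self.buckets[self.split] = {}
        self.buckets.append({})
        self.split += 1
        if self.split == self.n0 << self.level:   # round finished: table has doubled
            self.level += 1
            self.split = 0
        for fp, loc in old.items():
            self.buckets[self._bucket_of(fp)][fp] = loc


def backup(data: bytes, index: LinearHashIndex, chunk_size=4096):
    """Fixed-size chunking; returns (new_chunks, duplicate_chunks)."""
    new = dup = 0
    for off in range(0, len(data), chunk_size):
        if index.insert(fingerprint(data[off:off + chunk_size]), off):
            new += 1
        else:
            dup += 1
    return new, dup
```

Because each split moves the entries of only one bucket, the cost of growing the index is spread across many insertions, which is the property that makes a linear index attractive when the fingerprint table is too large to rebuild in one pass.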

Author information

Corresponding author

Correspondence to P. Neelaveni.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Neelaveni, P., Vijayalakshmi, M. (2016). FC-LID: File Classifier Based Linear Indexing for Deduplication in Cloud Backup Services. In: Bjørner, N., Prasad, S., Parida, L. (eds) Distributed Computing and Internet Technology. ICDCIT 2016. Lecture Notes in Computer Science, vol 9581. Springer, Cham. https://doi.org/10.1007/978-3-319-28034-9_28

  • DOI: https://doi.org/10.1007/978-3-319-28034-9_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28033-2

  • Online ISBN: 978-3-319-28034-9

  • eBook Packages: Computer Science, Computer Science (R0)
