User Similarity-Aware Data Deduplication Scheme for IoT Applications

  • Conference paper
Frontiers in Cyber Security (FCS 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1286)

Abstract

As an important technology in cloud storage, deduplication is widely used to save network bandwidth and storage resources. While deduplication brings convenience, it also introduces security risks. If an organization's internal data are treated in the same way as ordinary data, deduplication may lead to unexpected data leakage and other issues. We propose a user similarity-aware data deduplication scheme that properly handles internal data uploaded by group users. During deduplication, the scheme recognizes the situation in which uploaders with similar attributes hold the same data. Its goal is to ensure that the participation of group users does not change the current popularity of the uploaded data. For attribute distance calculation, we classify attributes by type and introduce a specific distance calculation method for each type. We determine the user category by comparing the similarity of user attributes. Finally, the counting method for uploaded data is adjusted adaptively according to the current popularity status of the data and the user categories. The scheme can avoid potential internal data leakage caused by deduplication. Experimental evaluation shows that the scheme is efficient, scalable, and practical.
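
The abstract does not give the distance formulas or the counting rule, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. The three attribute types, the per-type distance formulas, the averaging of distances, both thresholds, and every name (`attribute_distance`, `user_distance`, `PopularityCounter`) are hypothetical choices made for illustration. The sketch shows how an upload from a user who is similar to an already counted uploader could leave the popularity count unchanged.

```python
from dataclasses import dataclass, field

# Hypothetical attribute types; the paper's actual typing is not given here.
NUMERIC, CATEGORICAL, INTERVAL = "numeric", "categorical", "interval"


def attribute_distance(kind, a, b, scale=1.0):
    """Return an illustrative distance in [0, 1] between two attribute values."""
    if kind == NUMERIC:
        return min(abs(a - b) / scale, 1.0)
    if kind == CATEGORICAL:
        return 0.0 if a == b else 1.0
    if kind == INTERVAL:
        # Assumed formula: compare midpoints and widths of (low, high) pairs.
        mid = abs((a[0] + a[1]) - (b[0] + b[1])) / 2
        width = abs((a[1] - a[0]) - (b[1] - b[0])) / 2
        return min((mid + width) / scale, 1.0)
    raise ValueError(f"unknown attribute type: {kind}")


def user_distance(schema, u, v):
    """Average the per-attribute distances (an assumed aggregation rule)."""
    total = sum(
        attribute_distance(kind, u[name], v[name], scale)
        for name, (kind, scale) in schema.items()
    )
    return total / len(schema)


@dataclass
class PopularityCounter:
    """Tracks the popularity of one data item; thresholds are assumptions."""
    schema: dict
    similarity_threshold: float = 0.2
    popularity_threshold: int = 3
    counted_users: list = field(default_factory=list)

    def upload(self, user):
        """Record an upload; return True only if the count was incremented."""
        for seen in self.counted_users:
            if user_distance(self.schema, user, seen) <= self.similarity_threshold:
                return False  # similar to a counted uploader: count unchanged
        self.counted_users.append(user)
        return True

    @property
    def is_popular(self):
        return len(self.counted_users) >= self.popularity_threshold


# Hypothetical schema: attribute name -> (type, normalization scale).
SCHEMA = {
    "department": (CATEGORICAL, 1.0),
    "age": (NUMERIC, 50.0),
    "hours": (INTERVAL, 24.0),
}
alice = {"department": "R&D", "age": 30, "hours": (9, 17)}
bob = {"department": "R&D", "age": 32, "hours": (9, 18)}
carol = {"department": "Sales", "age": 55, "hours": (13, 22)}

counter = PopularityCounter(schema=SCHEMA)
for person in (alice, bob, carol):
    counter.upload(person)
print(len(counter.counted_users), counter.is_popular)
```

With these made-up values, Alice and Bob fall within the similarity threshold and are counted once, while Carol is counted separately, so the item stays below the assumed popularity threshold.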

Author information

Corresponding author

Correspondence to Hequn Xian.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Gao, Y., Xian, H., Teng, Y. (2020). User Similarity-Aware Data Deduplication Scheme for IoT Applications. In: Xu, G., Liang, K., Su, C. (eds) Frontiers in Cyber Security. FCS 2020. Communications in Computer and Information Science, vol 1286. Springer, Singapore. https://doi.org/10.1007/978-981-15-9739-8_4

  • DOI: https://doi.org/10.1007/978-981-15-9739-8_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-9738-1

  • Online ISBN: 978-981-15-9739-8

  • eBook Packages: Computer Science, Computer Science (R0)
