Abstract
As an important technology in cloud storage, deduplication is widely used to save network bandwidth and storage resources. While deduplication brings convenience, it also introduces security risks. If an organization's internal data are treated in the same way as ordinary data, deduplication may lead to unexpected data leakage and other issues. We propose a user similarity-aware data deduplication scheme that properly handles internal data uploaded by group users. During deduplication, the scheme recognizes the case in which uploaders with similar attributes hold the same data. Its goal is to ensure that the participation of group users does not change the current popularity of the uploaded data. For attribute distance calculation, we classify attributes by type and introduce a specific distance calculation method for each type. We determine a user's category by comparing the similarity of user attributes. Finally, the counting method for uploaded data is adjusted adaptively according to the data's current popularity status and the user categories. The scheme thereby avoids potential internal data leakage caused by deduplication. Experimental evaluation shows that the scheme is efficient, scalable, and practical.
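The counting idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the attribute schema, the three distance functions, the averaging rule, both thresholds, and every name (`SCHEMA`, `user_distance`, `PopularityCounter`) are assumptions made only for illustration. The sketch assumes that a type-specific distance is computed per attribute, that two uploaders are treated as members of the same group when their mean distance is small, and that a group member's upload of unpopular data does not raise the data's popularity count.

```python
# Illustrative sketch only: schema, distances, and thresholds are assumptions,
# not the scheme defined in the paper.

def numeric_distance(a, b, lo, hi):
    """Absolute difference normalized by the attribute's value range."""
    return abs(a - b) / (hi - lo)

def categorical_distance(a, b):
    """0 when the labels match, 1 otherwise."""
    return 0.0 if a == b else 1.0

def interval_distance(a, b, lo, hi):
    """Compare two (low, high) intervals by midpoint and half-width, normalized."""
    mid = abs((a[0] + a[1]) / 2 - (b[0] + b[1]) / 2)
    width = abs((a[1] - a[0]) - (b[1] - b[0])) / 2
    return min(1.0, (mid + width) / (hi - lo))

# Hypothetical attribute schema: name -> (type, value bounds).
SCHEMA = {
    "department": ("categorical", None),
    "age": ("numeric", (18, 70)),
    "work_hours": ("interval", (0, 24)),
}

def user_distance(u, v, schema=SCHEMA):
    """Mean of the per-attribute distances, each lying in [0, 1]."""
    total = 0.0
    for name, (kind, bounds) in schema.items():
        if kind == "categorical":
            total += categorical_distance(u[name], v[name])
        elif kind == "numeric":
            total += numeric_distance(u[name], v[name], *bounds)
        else:
            total += interval_distance(u[name], v[name], *bounds)
    return total / len(schema)

class PopularityCounter:
    """Counts uploads per data tag, skipping similar users while data is unpopular."""

    def __init__(self, similarity_threshold=0.2, popularity_threshold=3):
        self.similarity_threshold = similarity_threshold
        self.popularity_threshold = popularity_threshold
        self.counts = {}           # data tag -> popularity count
        self.representatives = {}  # data tag -> attribute dicts already counted

    def upload(self, tag, user):
        reps = self.representatives.setdefault(tag, [])
        count = self.counts.get(tag, 0)
        similar = any(
            user_distance(user, r) <= self.similarity_threshold for r in reps
        )
        # Already-popular data is counted normally; otherwise an uploader
        # similar to an already counted one does not raise the count.
        if count >= self.popularity_threshold or not similar:
            count += 1
            reps.append(user)
        self.counts[tag] = count
        return count

    def is_popular(self, tag):
        return self.counts.get(tag, 0) >= self.popularity_threshold

alice = {"department": "R&D", "age": 30, "work_hours": (9, 18)}
bob = {"department": "R&D", "age": 32, "work_hours": (9, 17)}
carol = {"department": "Sales", "age": 50, "work_hours": (12, 22)}

counter = PopularityCounter()
first = counter.upload("tag-1", alice)
second = counter.upload("tag-1", bob)    # bob is close to alice, so not counted
third = counter.upload("tag-1", carol)   # carol differs, so counted
```

In this sketch, two colleagues uploading the same internal file leave its count at one, so the file does not drift toward the popular state merely because a group shares it.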
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Gao, Y., Xian, H., Teng, Y. (2020). User Similarity-Aware Data Deduplication Scheme for IoT Applications. In: Xu, G., Liang, K., Su, C. (eds) Frontiers in Cyber Security. FCS 2020. Communications in Computer and Information Science, vol 1286. Springer, Singapore. https://doi.org/10.1007/978-981-15-9739-8_4
DOI: https://doi.org/10.1007/978-981-15-9739-8_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9738-1
Online ISBN: 978-981-15-9739-8
eBook Packages: Computer Science, Computer Science (R0)