Improving file locality in multi-keyword top-k search based on clustering

  • Methodologies and Application
Soft Computing

Abstract

A fast-growing number of users and businesses are motivated to outsource their private data to public cloud servers. For security reasons, private data should be encrypted before being outsourced to remote servers, but encryption makes traditional plaintext keyword search difficult. There is therefore an urgent need for efficient and secure searchable encryption. In this paper, a clustering method that combines affinity propagation (AP) with K-means (CAK-means) is proposed to realize fast searchable encryption in Big Data environments. CAK-means uses affinity propagation to initialize K-means clustering, which makes the clustering process faster and more stable and improves the quality of the initial K-means cluster centers. Because the AP algorithm identifies cluster centers with much lower error than other methods, it significantly improves search accuracy. In addition, the related files in one cluster are stored at contiguous locations on disk, which substantially improves file locality and speeds up disk read and write I/O. The coordinate matching measure is used to support accurate ranking of search results. Experimental results show that the proposed CAK-means-based multi-keyword ranked searchable encryption scheme (MRSE-CAK) achieves higher search efficiency and accuracy while ensuring equivalent security.
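
The pipeline summarized in the abstract can be illustrated with a minimal plaintext sketch. It is not the authors' implementation: encryption and the secure index are omitted, and the function names, the negative squared Euclidean similarity, the median preference, the damping factor and the toy keyword vectors are all assumptions made for illustration. The sketch seeds K-means with affinity propagation exemplars, orders files so that each cluster is contiguous, and ranks files by coordinate matching (the number of query keywords a file contains).

```python
# Illustrative sketch only: all names and parameters are hypothetical, and
# the vectors are plaintext (no encryption, no secure index).

def neg_sq_dist(a, b):
    """Similarity used here: negative squared Euclidean distance (assumed)."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))

def ap_exemplars(points, damping=0.5, iters=100):
    """Return indices of exemplars found by affinity propagation."""
    n = len(points)
    S = [[neg_sq_dist(p, q) for q in points] for p in points]
    off_diag = sorted(S[i][j] for i in range(n) for j in range(n) if i != j)
    pref = off_diag[len(off_diag) // 2]  # median similarity as preference (assumed)
    for i in range(n):
        S[i][i] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        for i in range(n):
            for k in range(n):
                best_other = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        for k in range(n):
            pos = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from j != k
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    return exemplars or [0]  # fall back to one seed if no exemplar emerged

def kmeans(points, centers, iters=20):
    """Plain K-means started from the given centers (here: AP exemplars)."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its most similar (nearest) center.
        labels = [max(range(len(centers)), key=lambda c: neg_sq_dist(p, centers[c]))
                  for p in points]
        # Recompute each non-empty cluster's center as the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def contiguous_layout(labels):
    """Order file ids so that files of one cluster are adjacent in storage."""
    return sorted(range(len(labels)), key=lambda i: (labels[i], i))

def coordinate_match(file_vec, query_vec):
    """Number of query keywords present in the file (binary inner product)."""
    return sum(f * q for f, q in zip(file_vec, query_vec))

def top_k(files, query, k):
    """Ids of the k best-matching files; ties are broken by file id."""
    order = sorted(range(len(files)),
                   key=lambda i: (-coordinate_match(files[i], query), i))
    return order[:k]

# Toy data: one binary keyword vector per file (1 = keyword present).
files = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 0],
]
query = [1, 1, 0, 0, 0]

seeds = [files[i] for i in ap_exemplars(files)]
labels, _ = kmeans(files, seeds)
layout = contiguous_layout(labels)
print(layout)
print(top_k(files, query, 2))  # [0, 1]
```

Seeding K-means with exemplars removes the need to guess the number of clusters or to pick random initial centers, which is the property the abstract attributes to CAK-means. The layout step only models the idea of storing one cluster's files next to each other; an actual system would map this order to physical disk locations.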

Acknowledgements

This work was supported by the Natural Science Foundation of China (Nos. 61602118, 61572010 and 61472074), Fujian Normal University Innovative Research Team (No. IRTL1207), Natural Science Foundation of Fujian Province (Nos. 2015J01240, 2017J01738), Science and Technology Projects of Educational Office of Fujian Province (No. JK2014009), and Fuzhou Science and Technology Plan Project (No. 2014-G-80).

Author information

Corresponding author

Correspondence to Lanxiang Chen.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Chen, L., Zhang, N., Li, KC. et al. Improving file locality in multi-keyword top-k search based on clustering. Soft Comput 22, 3111–3121 (2018). https://doi.org/10.1007/s00500-018-3145-6

  • DOI: https://doi.org/10.1007/s00500-018-3145-6
