Enabling Cooperative Privacy-preserving Personalized search in cloud environments

doi:10.1016/j.ins.2018.12.016

Information Sciences

Volume 480, April 2019, Pages 1-13

https://doi.org/10.1016/j.ins.2018.12.016

Abstract

As the information society expands, individuals and companies generate huge amounts of data every second. Cloud computing, as a storage platform, makes storing and using this data far more convenient. However, security and privacy issues in cloud computing have hindered its further development. To address them, researchers have proposed encrypting sensitive data, but encryption brings follow-up issues such as increased computational overhead and inconvenient information retrieval. In this paper, to let users obtain the most satisfactory results when searching outsourced encrypted data while also reducing the computational overhead of cloud servers, we propose the Cooperative Privacy-preserving Personalized Search (CPPS) scheme for cloud environments, which uses matrix encryption to protect user privacy. Combining the searches of multiple users greatly reduces the computing cost of the cloud server while preserving the accuracy of each user’s personalized search results. Experiments on the Yelp dataset indicate that the CPPS scheme has low index construction overhead, low computational overhead, and low communication overhead while providing users with personalized search results.

Introduction

The rapid growth in the amount of data has drawn increasing attention to the value of data, and with it to increasingly prominent issues such as data privacy and security [11], [28], [31]. As a recent example shows, Hillary’s support rate dropped dramatically after her use of private mailboxes to send and receive official e-mails was exposed. Apart from security, how to make good use of data is also a thorny issue. Cloud computing offers superior storage and computing capabilities and thus provides a good platform for big data storage and applications. Currently, encryption may be the best choice for protecting data security and users’ privacy [30]; however, encryption degrades the availability of data. Therefore, guaranteeing data availability while ensuring data security and users’ privacy is a major challenge.

To address this problem, searchable encryption came into being. Researchers [4], [29], [36] have put forward many constructive schemes for different applications. However, most existing schemes are keyword-based searchable encryption schemes, which can no longer fully meet new challenges and the growing needs of users. This shortfall is mainly reflected in the following two aspects.

First, when the same keywords are used for searching, the required information varies from person to person because of different personal experiences, interests, and the like. For example, a user interested in electronic products who searches for “Apple” probably wants results related to the Apple company, whereas a user who likes eating fruit and searches for “Apple” probably wants information about the fruit. However, most current schemes fail to take this into account. In these schemes, the cloud server returns all search results related to the user’s search keywords. This not only increases network bandwidth consumption but also overwhelms users with a sea of data. As a result, the user has to spend a lot of time before obtaining the required information, which seriously degrades the user’s search experience. Therefore, it is necessary to design a search scheme that can understand the user’s search intentions [18].

Second, most existing schemes fail to take into account the computational and communication overhead of the cloud service process, yet data stored in the cloud is useful only when it can be searched easily. Liu [21] proposed a cooperative private searching protocol and compared it with the Ostrovsky protocol [27], finding that its computational and communication overhead are far lower than those of the Ostrovsky protocol. However, its computational overhead is still too high, and the long waiting time seriously degrades the user’s search experience. In addition, the protocol supports only keyword search and cannot customize the search results for each person, which reduces search accuracy to a certain extent. Therefore, it is necessary to develop an effective privacy-preserving search scheme with low bandwidth and computing consumption.

Against this background, and on the premise of guaranteeing users’ privacy, this paper aims to design a personalized search scheme suitable for the ciphertext environment with low computational and communication overhead. In CPPS, an aggregation and distribution layer (ADL) is introduced so that a large number of users’ personalized queries can be aggregated and uploaded to the cloud server, and each user can obtain search results related to both the query words and the user’s interest model.
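The aggregate-then-distribute flow described above can be illustrated with a minimal plaintext sketch. It is not the CPPS construction itself: the element-wise sum used to merge queries, the per-user re-ranking at the ADL, and the names `cooperative_search`, `top_k`, and `dot` are all assumptions made for illustration, and every encryption step is omitted.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(scores, k):
    """Positions of the k highest scores, best first (ties go to the lower position)."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]


def cooperative_search(index, queries, k_per_user, merged_top_k):
    """Plaintext illustration of an ADL-style flow (assumed, not taken from the paper).

    index        : list of document weight vectors held by the cloud server
    queries      : one query vector per user
    k_per_user   : how many results each user asks for (k_i)
    merged_top_k : how many candidates the cloud returns for the merged query (K)
    """
    # ADL: merge the users' queries into one vector (assumption: element-wise sum).
    merged = [sum(column) for column in zip(*queries)]
    # Cloud server: a single scoring pass over the index for the merged query.
    candidates = top_k([dot(doc, merged) for doc in index], merged_top_k)
    # ADL: re-rank the K candidates for each user and keep that user's top-k_i.
    results = []
    for query, k in zip(queries, k_per_user):
        ranked = top_k([dot(index[i], query) for i in candidates], k)
        results.append([candidates[j] for j in ranked])
    return results


if __name__ == "__main__":
    index = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
    queries = [[1, 0, 0], [0, 1, 0]]
    print(cooperative_search(index, queries, k_per_user=[2, 2], merged_top_k=3))
```

In this toy setting the cloud server scores the index once instead of once per user, which is the kind of saving the ADL is meant to provide.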

The main contributions of this paper compared with [5], [14], [20], [21] are listed below:

1) The CPPS scheme is the first cooperative personalized search scheme in cloud environments. By introducing an ADL server, it merges users’ queries and returns personalized search results to each user, thus improving efficiency.
2) With the assistance of secure kNN [33], the search of the merged query is performed on the cloud server while the distribution of the results is carried out on the ADL server, so as to ensure the privacy of users’ information.
3) The scheme is shown to protect users’ privacy well, and its efficiency and feasibility are demonstrated through experiments. When the number of merged users is 3, $k_{i} = 10$ $(i = 1, 2, 3)$, and $K = 20$, the communication overhead of the CPPS scheme is 66.7% of that without the ADL, and the computing overhead of the cloud server is 33.3% of that without the ADL, so the CPPS scheme can effectively alleviate the performance bottleneck of the cloud server.
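One plausible reading of the 66.7% and 33.3% figures, offered here as an assumption rather than as the paper’s own derivation, is that without the ADL the cloud server processes one query per user and returns $\sum_i k_i$ results, whereas with the ADL it processes a single merged query and returns $K$ results. The short Python check below reproduces both ratios under that assumption; the function name `adl_overhead_ratios` is hypothetical.

```python
def adl_overhead_ratios(k_per_user, merged_top_k):
    """Return (communication_ratio, computation_ratio) of ADL vs. no ADL.

    Assumed cost model (not stated in the text):
      * communication is proportional to the number of results the cloud returns;
      * cloud computation is proportional to the number of queries it scores.
    """
    n_users = len(k_per_user)
    results_without_adl = sum(k_per_user)   # each user receives k_i results directly
    communication_ratio = merged_top_k / results_without_adl
    computation_ratio = 1 / n_users         # one merged query instead of n_users queries
    return communication_ratio, computation_ratio


if __name__ == "__main__":
    comm, comp = adl_overhead_ratios([10, 10, 10], merged_top_k=20)
    print(f"communication: {comm:.1%}, computation: {comp:.1%}")
    # prints "communication: 66.7%, computation: 33.3%"
```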

The remainder of this paper is organized as follows. Section 2 discusses related work on searchable encryption and personalized search. Section 3 gives the relevant definitions. Section 4 presents the system model, the threat model, and the design goals, and summarizes the notations. Section 5 proposes the CPPS scheme. Section 6 presents its security and privacy analysis. Section 7 presents the simulation experiments. Finally, Section 8 concludes the paper.

Section snippets

Searchable encryption

To ensure data security and user privacy, data encryption is common practice. However, encryption degrades the availability of data, and finding the files you need in a large amount of ciphertext is a question worth considering. To solve the problem of searching over encrypted data, Song et al. [29] proposed the first symmetric searchable encryption scheme, which used stream ciphers to encrypt keywords. In the process of retrieval, through keywords and one-to-one matches between

Preliminaries

In this section, relevant definitions will be introduced as follows:

“TF-IDF”. “TF-IDF” is a frequently used weighting technique in information retrieval that has been applied in many studies [1], [10], [15]. “TF” is the abbreviation of term frequency, and “IDF” is the abbreviation of inverse document frequency. For this reason, “TF-IDF” is used in the CPPS scheme to obtain the weights of keywords, including the keywords of files.
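As a minimal sketch of this weighting, assuming the common definition in which a term’s weight is its term frequency multiplied by the logarithm of its inverse document frequency (the text above does not state which variant CPPS uses), the following Python function computes “TF-IDF” weights for a small tokenized corpus. The function name `tf_idf` and the sample documents are illustrative only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for a corpus.

    docs is a list of token lists; the result is one {term: weight} dict per document.
    TF  = occurrences of the term in the document / document length
    IDF = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    corpus = [
        ["apple", "phone", "apple"],
        ["apple", "fruit"],
        ["fruit", "juice"],
    ]
    for doc_weights in tf_idf(corpus):
        print(doc_weights)
```

A term that occurs in every document receives weight 0 under this definition, which is why rare, distinguishing keywords dominate the resulting vectors.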

User interest model. User-specific search results are inseparable from user

System model and notations

CPPS involves four different entities, namely, the data owner, the data user, the ADL and the cloud server. An overview of the CPPS scheme is shown in Fig. 1, and a summary of notations is given in Table 1.

For ease of understanding, this paper uses a single ADL server. However, if necessary, the CPPS scheme is also applicable to scenarios with multiple ADLs. On the user side, a user interest model is stored to protect the user’s private information. At the same time, to better reflect the

CPPS scheme

The CPPS scheme merges user queries by introducing an ADL server and feeds the top-$k_{i}$ sorted results back to each user. The scheme can effectively reduce the computational overhead of the cloud server and the communication overhead between the cloud server and the ADL server, making it well suited to resource-saving cloud environments. The privacy of the user and of the index is well protected by introducing a secure inner product [33]. At the same time, the user can quickly obtain
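The construction of the secure inner product [33] is not reproduced in this snippet. As a rough illustration of the idea usually associated with secure kNN, the Python sketch below splits each vector according to a secret bit string and multiplies the shares by secret invertible matrices, so that the server can compute the inner product without seeing either plaintext vector. It is a simplified sketch under stated assumptions, not the CPPS algorithm: it omits the dimension extension and random noise that practical schemes add, it uses exact rational arithmetic (`fractions.Fraction`) so the equality check is exact, and all function names are hypothetical.

```python
import random
from fractions import Fraction


def invert(m):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if m is singular."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def random_invertible(n, rng):
    """Draw random integer matrices until one is invertible; return (M, M^-1)."""
    while True:
        m = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
        try:
            return m, invert(m)
        except ValueError:
            continue


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def keygen(n, rng):
    """Secret key: a split indicator S and two invertible matrices with inverses."""
    s = [rng.randint(0, 1) for _ in range(n)]
    m1, m1_inv = random_invertible(n, rng)
    m2, m2_inv = random_invertible(n, rng)
    return s, m1, m1_inv, m2, m2_inv


def split(vec, s, split_bit, rng):
    """Where S[j] == split_bit, split vec[j] into two random shares; else copy it."""
    a, b = [], []
    for x, bit in zip(vec, s):
        x = Fraction(x)
        r = Fraction(rng.randint(-100, 100)) if bit == split_bit else x
        a.append(r)
        b.append(x - r if bit == split_bit else x)
    return a, b


def encrypt_index(p, key, rng):
    s, m1, _, m2, _ = key
    p1, p2 = split(p, s, 1, rng)            # index vectors are split where S[j] == 1
    return mat_vec(transpose(m1), p1), mat_vec(transpose(m2), p2)


def encrypt_query(q, key, rng):
    s, _, m1_inv, _, m2_inv = key
    q1, q2 = split(q, s, 0, rng)            # query vectors are split where S[j] == 0
    return mat_vec(m1_inv, q1), mat_vec(m2_inv, q2)


def score(enc_index, enc_query):
    """Server-side relevance score; equals the plaintext inner product p . q."""
    return dot(enc_index[0], enc_query[0]) + dot(enc_index[1], enc_query[1])


if __name__ == "__main__":
    rng = random.Random(7)
    key = keygen(4, rng)
    p, q = [1, 0, 2, 3], [2, 1, 0, 1]
    print(score(encrypt_index(p, key, rng), encrypt_query(q, key, rng)) == dot(p, q))
```

The check works because $(M^{T}p')\cdot(M^{-1}q') = p'\cdot q'$ for each share, and the complementary splitting makes the two share products sum to $p \cdot q$ coordinate by coordinate.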

Security and privacy analysis

In this section, the security of the CPPS scheme is analyzed against the cloud server and the ADL server, both of which are subject to the honest-but-curious (HBC) model. We also briefly analyze the CPPS scheme against malicious attackers. The specific analysis is as follows:

Simulation experiment

The “business” and “review” data in the Yelp dataset are used to analyze the CPPS scheme. A random subset of the “business” data, together with all “review” data related to it, is selected as the experimental dataset.

The entire experiment is run on a 2.6GHz Intel(R) Core(TM) i7-6700HQ CPU with 16GB of RAM under the Windows 10 operating system. The simulation code is implemented in Matlab R2016b, and OriginPro 2017 is used to process the experimental data.

Conclusion

With the advent of the era of big data, information overload and privacy protection have received more and more attention. By introducing the ADL server, this paper proposes the CPPS scheme, which allows users to perform cooperative personalized search. Merging users’ queries and distributing the search results returned by the cloud server effectively alleviates the performance bottleneck of the cloud server and thus better fits a resource-saving cloud environment. Meanwhile, the

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China under Grants 61632009 and 61472451, in part by the Guangdong Provincial Natural Science Foundation under Grant 2017A030308006, in part by the High-Level Talents Program of Higher Education in Guangdong Province under Grant 2016ZJ01, and in part by the Fundamental Research Funds for the Central Universities of Central South University under Grant 2017zzts141.

References (42)

O. Alhabashneh et al., Fuzzy rule based profiling approach for enterprise information seeking and retrieval, Inf. Sci. (2017)
J. Cui et al., Akser: attribute-based keyword search with efficient revocation in cloud computing, Inf. Sci. (2018)
D. Datta et al., Multimodal retrieval using mutual information based textual query reformulation, Expert Syst. Appl. (2017)
C.J. D’Orazio et al., A technique to circumvent SSL/TLS validations on iOS devices, Future Gener. Comput. Syst. (2017)
Z. Fu et al., Enabling personalized search over encrypted outsourced data with efficiency improvement, IEEE Trans. Parallel Distrib. Syst. (2016)
E. Luo et al., Privacy-preserving multi-hop profile-matching protocol for proximity mobile social networks, Future Gener. Comput. Syst. (2017)
E. Luo et al., Hierarchical multi-authority and attribute-based encryption friend discovery scheme in mobile social networks, IEEE Commun. Lett. (2016)
Y. Miao et al., Enabling verifiable multiple keywords search over encrypted cloud data, Inf. Sci. (2018)
R. Ostrovsky et al., Private searching on streaming data, Annual International Cryptology Conference (2005)
T. Peng et al., Collaborative trajectory privacy preserving scheme in location-based services, Inf. Sci. (2017)
I. Stojmenovic et al., An overview of fog computing and its security issues, Concurrency Comput. (2016)
Y. Yang et al., Expressive query over outsourced encrypted data, Inf. Sci. (2018)
Q. Zhang et al., Prms: a personalized mobile search over encrypted outsourced data, IEEE Access (2018)
Q. Zhang et al., Attribute-based encryption with personalized search, 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC) (2017)
M.R. Baye et al., Search engine optimization: what drives organic traffic to retail sites?, J. Econ. Manage. Strategy (2016)
D. Boneh, G. Di Crescenzo, R. Ostrovsky, G. Persiano, Public key encryption with keyword search 3027 (16) (2004)...
D. Boneh et al., Public key encryption that allows PIR queries, Annual International Cryptology Conference (2007)
N. Cao et al., Privacy-preserving multi-keyword ranked search over encrypted cloud data, IEEE Trans. Parallel Distrib. Syst. (2014)
Y.-C. Chang et al., Privacy preserving keyword searches on remote encrypted data, International Conference on Applied Cryptography and Network Security (2005)
B. Chor et al., Private information retrieval, Proceedings 36th Annual Symposium on Foundations of Computer Science (1995)
R. Curtmola et al., Searchable symmetric encryption: improved definitions and efficient constructions, J. Comput. Secur. (2011)

Cited by (18)

Preserving privacy while revealing thumbnail for content-based encrypted image retrieval in the cloud
2022, Information Sciences
Owing to the rapid development of cloud services and personal privacy requirements, content-based encrypted image retrieval in the cloud has been increasing. Outsourced images are encrypted into noiselike ones to protect privacy, however, the obtained unrecognized appearance limits their availability. Besides, users have to decrypt all search results to browse, while some of them may not be needed, which undoubtedly wastes bandwidth and computing resources. To cope with this problem, a compromise strategy is proposed that considers the tradeoff between privacy and usability of cipher images. Wherein, a thumbnail preserving encryption (TPE) based on genetic algorithm is proposed. The pixels in the sub-blocks of the plain image are scrambled and diffused at the bit-level through crossover and mutation operators of the genetic algorithm. Moreover, two new operators of Mutation Compensation and Mutation Failure are defined and incorporated into the traditional genetic algorithm to achieve an ideal TPE, that cipher image has the same thumbnail as the original image. Additionally, a color histogram-based retrieval algorithm is introduced to retrieve cipher images using the color information preserved by thumbnails; and to improve retrieval accuracy by using the Bhattacharyya distance. A series of simulations verify the security and effectiveness of our scheme.
DMSE: Dynamic Multi-keyword Search Encryption based on inverted index
2021, Journal of Systems Architecture
Citation Excerpt :
Next, we find the OABKS [18] and DMSE schemes can be applied in a multi-DO/multi-DU scenario, i.e., a more practical application scenario. However, the DOAS [7], ECPPS [10], MRSMS [11] and IIMP [13] only consider single DO or single DU application scenario, which will greatly limit the scalability of a KSE scheme. Besides, only our scheme takes into account keyword updating among above six schemes.
With the popularity of cloud storage, a growing number of people are willing to upload their files to cloud services. They encrypt these files before uploading to protect their privacy. However, encryption makes effective search very difficult and inefficient. In this paper, we design a dynamic multi-keyword searchable encryption scheme over encrypted cloud files, called DMSE. Firstly, the outsourced files in DMSE are pre-classified and preprocessed according to their privacy to achieve classified search. Secondly, based on the inverted index, DMSE realizes multi-keyword search, which protects the privacy of outsourced files and users and greatly improves search efficiency. In addition, DMSE extends the inverted index to the multi-DO/multi-DU scenario, which distinguishes DMSE from other existing multi-keyword search schemes. Thirdly, DMSE further supports keyword updates to cope with mismatches between the initially extracted keywords and the outsourced files, thus improving the practicality of the scheme. Finally, a comprehensive performance evaluation shows that the DMSE scheme is effective and feasible in practical applications.
Efficient personalized search over encrypted data for mobile edge-assisted cloud storage
2021, Computer Communications
Cloud storage services allow a data owner to share her/his outsourced data with other users and enable those users to search for target data by keywords. To ensure data confidentiality, data owners usually encrypt data using traditional encryption schemes before outsourcing. However, this makes efficient searching impossible. Symmetric searchable encryption (SSE) is a cryptographic primitive that resolves this tension. However, most existing SSE schemes do not consider the individual characteristics of users during the search, so they cannot support personalized search services over encrypted data. Meanwhile, security and efficiency issues in the cloud service model have also severely affected the user’s search experience, and the introduction of mobile edge servers can solve these problems to some extent. In this paper, we propose a personalized searchable encryption scheme (PSED) for mobile edge-assisted cloud storage. Our contribution consists of three aspects. First, we incorporate the user’s preference factors into the user’s query, which enables users to get accurate personalized search results. Second, the computational overhead of the cloud server is reduced by calculating the relevance scores of the subqueries and subindexes on mobile edge servers. Third, by cutting the index and the query matrix, the encryption efficiency of the index and the query matrix is improved. Security analysis shows that PSED can guarantee the privacy of the data and the user. Experimental results demonstrate that the proposed schemes are highly efficient and accurate.
Privacy preserving and data transpiration in multiple cloud using secure and robust data access management algorithm
2021, Microprocessors and Microsystems
Citation Excerpt :
It gains by Fully Homomorphic Encryption (FHE), which empowers calculations on private health information without actually observing the basic information by Kocabas et al. [1]. Zhang et al. [2] expressed public auditing scheme with identity security for cloud storage. Here, single client, including a group manager, can't know the signer's identity.
Nowadays, privacy preservation plays an important role in cloud computing, where content-based privacy is a challenging task in an untrusted cloud environment. According to the literature, existing methods suffer from key complexity and from authentication issues in access policies. A system needs to provide a strong encryption function and efficient key distribution policies to meet future challenges. It also needs to address real-time applications in cloud environments with minimal computation cost, inherent defects in key management, and flexible access control policy. Current approaches still rely on user identity, mutual privacy, and session-wise key agreement among the content owner, the trusted client, and the cloud service provider. This work designs a framework named the Secure and Robust Data Access Management (SRDAM) algorithm to maintain enhanced privacy, secure data transportation, and data access management. The proposed algorithm consolidates the validation of cloud service providers and then considers the property requirements of the cloud user and the cloud service provider (CSP). Proposed SRDAM Algorithm reduces the 1.79 s Data uploading time (DUT), Data 1.80 s Downloading Time (DDT) and 11.02 s Communication Overhead (CO) for document, images and video for conventional methodologies.
A trajectory privacy-preserving scheme based on a dual-K mechanism for continuous location-based services
2020, Information Sciences
Location-based services (LBSs) are increasingly provided by a broad range of devices and applications, but one associated risk is location disclosure. To solve this problem, a common method is to adopt K-anonymity in a centralized architecture based on a single trusted anonymizer. However, this strategy may compromise user privacy in continuous LBSs. In this study, we propose a dual-K mechanism (DKM) to protect users’ trajectory privacy in continuous LBSs. The proposed DKM first inserts multiple anonymizers between the user and the location service provider (LSP), and K query locations are sent to different anonymizers to achieve K-anonymity. Simultaneously, we combine the dynamic pseudonym and the location selection mechanisms to improve user trajectory privacy. Hence, neither the LSP nor the anonymizer can obtain the user trajectory. Security analysis demonstrates that our proposed scheme can effectively enhance user trajectory privacy protection, and the simulation results show that the DKM scheme can preserve user trajectory privacy with low overhead on a single anonymizer.
An Edge Computing-enhanced Internet of Things Framework for Privacy-preserving in Smart City
2020, Computers and Electrical Engineering
To supervise massive generated data by the Internet of Things (IoT) efficiently, we face two issues that should be addressed which are: (1) heterogeneity or satisfying diversity among IoT devices, and (2) privacy-preserving or preventing unintentional disclosure of sensitive data. Through observation, we found that existing solutions apply one common privacy-preserving rule for all devices while they address the heterogeneity issue separately that lead to unappealing performance. In this paper, we propose a framework for addressing the heterogeneity issue and privacy-preserving of IoT devices at the network edge using a novel proposed ontology data model. Besides, it leverages the proposed ontology to obtain a privacy-preserving method by frequently changing the privacy-preserving behaviors of IoT devices. Through simulation, we show that our solution overhead is less than 9 percent in the worst situation so that it is affordable to most IoT devices in one of its applications that is smart city.
