A quantitative measure of the information leaked from queries to search engines and a scheme to reduce it

The Journal of Supercomputing

Abstract

In recent years, opportunities to use search engines have increased with the growing variety and number of Internet-capable devices. Search engines have become indispensable for many users, who supply vast amounts of information as input. However, the risk posed by search engine providers storing and analyzing privacy-related user information has recently been recognized. Existing research (Jones et al. 2007 [3]) evaluates the protection of user privacy against search engines that try to extract such information from users' query strings. Searchable encryption is effective for searching with encrypted queries and is useful when users search their own data held in external cloud storage. However, search engine providers profit from the query strings of their many users, so they are not expected to adopt this approach. Private information retrieval (PIR) is an established technique that ensures no information is leaked to the search engine. However, PIR is based on a model with strict limitations on retrieval and is impractical. In this paper, we define a measure, based on the entropy of query strings, that quantifies the amount of information leaked during a search. We also propose a practical search scheme that reduces the amount of leaked information. The proposed scheme is simple and can be implemented on a typical personal computer. We evaluate the system experimentally and confirm that the proposed scheme works as intended and offers acceptable usability.
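The abstract states only that the measure is based on the entropy of query strings and does not give its exact definition. As a rough illustration of the underlying quantity, the following minimal sketch computes the Shannon entropy of the empirical distribution of a list of query strings. The function name `shannon_entropy` and the sample query lists are assumptions made for this example, and the sketch should not be read as the paper's actual measure or scheme.

```python
import math
from collections import Counter


def shannon_entropy(queries):
    """Return the Shannon entropy, in bits, of the empirical
    distribution of the given query strings.

    Illustrative only: this is plain Shannon entropy over observed
    query frequencies, not the leakage measure defined in the paper.
    """
    counts = Counter(queries)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty log carries no information
    # H = -sum(p * log2(p)) over the distinct query strings
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical query logs, invented for this example.
uniform_log = ["weather", "news", "maps", "recipes"]
skewed_log = ["weather", "weather", "weather", "news"]

print(shannon_entropy(uniform_log))  # 2.0 bits for four equally likely queries
print(shannon_entropy(skewed_log))
```

In this sketch a log dominated by one query yields a lower entropy than a log whose queries are equally frequent. How that quantity is turned into a measure of leaked information is left to the full article.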

References

  1. Yamamoto H, Hiraide Y (2016) A study on the information content leaked from queries to search engines and its reduction. In: Proc. of the 22nd International Conference on Parallel and Distributed Processing Techniques and Applications, pp 277–282

  2. Preibusch S (2013) The value of privacy in Web search. In: Twelfth Workshop on the Economics of Information Security

  3. Jones R et al (2007) I know what you did last summer—query logs and user privacy. In: Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management, pp 909–914

  4. Gentry C (2009) A fully homomorphic encryption scheme. PhD dissertation, Stanford University

  5. Chor B, Goldreich O, Kushilevitz E, Sudan M (1995) Private information retrieval. In: Proceedings of the 36th Annual Symposium on Foundations of Computer Science. IEEE Computer Society Press, p 41

  6. Chor B, Goldreich O, Kushilevitz E, Sudan M (1998) Private information retrieval. J ACM 45(6):965–981

Author information

Correspondence to Hiroshi Yamamoto.

Additional information

A preliminary version of this paper has appeared in PDPTA 2016 [1].

About this article

Cite this article

Yamamoto, H., Hiraide, Y. & Ishii, H. A quantitative measure of the information leaked from queries to search engines and a scheme to reduce it. J Supercomput 73, 2494–2505 (2017). https://doi.org/10.1007/s11227-016-1942-1

  • DOI: https://doi.org/10.1007/s11227-016-1942-1
