ABSTRACT
The Web is the largest repository of information, and personal information is usually scattered across many pages on different websites. Search engines make this information easy to find: an attacker can use them to gather a user's scattered information and infer private information from it. We call this kind of attack a Privacy Inference Attack via Search Engines.
In this paper, we propose a user-side automatic detection service that detects privacy leakage before personal information is published. In this service, we construct a User Information Correlation (UICA) graph to model the associations among the user information returned by search engines. We map the privacy inference attack to the decision problem of searching for the privacy inferring path with the maximal probability in the UICA graph, and we propose a Privacy Leakage Detection Probability (PLD-Probability) algorithm to find this path. Extensive experiments indicate that the algorithm is reasonable and effective.
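The idea of searching a graph for the inferring path with the maximal probability can be illustrated with a small sketch. This is not the PLD-Probability algorithm itself, which the abstract does not specify. It assumes that each edge carries a probability in (0, 1] and that the probability of a path is the product of its edge probabilities. Under those assumptions a Dijkstra-style search finds the most probable path. The function name `max_probability_path`, the node labels, and the edge values are all hypothetical.

```python
import heapq


def max_probability_path(graph, source, target):
    """Return (probability, path) for the most probable source-to-target path.

    Assumption (not from the paper): graph maps node -> {neighbor: p} with
    0 < p <= 1, and a path's probability is the product of its edge values.
    """
    best = {source: 1.0}       # best probability found so far per node
    parent = {}                # predecessor on the best path
    heap = [(-1.0, source)]    # max-heap via negated probabilities
    done = set()

    while heap:
        neg_p, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nxt, p in graph.get(node, {}).items():
            cand = -neg_p * p
            # Probabilities never exceed 1, so extending a path cannot
            # raise its probability; this keeps the greedy search valid.
            if cand > best.get(nxt, 0.0):
                best[nxt] = cand
                parent[nxt] = node
                heapq.heappush(heap, (-cand, nxt))

    if target not in best:
        return 0.0, []

    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    path.reverse()
    return best[target], path


# Hypothetical correlation graph: nodes are pieces of user information,
# edge values are illustrative association probabilities.
graph = {
    "nickname": {"email": 0.9, "forum post": 0.6},
    "email": {"employer": 0.5, "real name": 0.7},
    "forum post": {"real name": 0.8},
    "real name": {"home address": 0.4},
    "employer": {"home address": 0.3},
}

prob, path = max_probability_path(graph, "nickname", "home address")
print(path, round(prob, 3))
```

In this toy graph the search returns the path through "email" and "real name", because its product of edge values is larger than that of the two alternatives. A detection service in the spirit of the abstract could compare such a probability against a threshold before the user publishes a new piece of information; that threshold is likewise an assumption here.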
Index Terms
- Preserving privacy on the searchable internet