City of Disguise: A Query Obfuscation Game on the ClueWeb

  • Conference paper
Advances in Information Retrieval (ECIR 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13186)

Abstract

We present City of Disguise, a retrieval game that tests how well searchers can reformulate a sensitive query in a ‘Taboo’-style setup while still retrieving good results. Given one of 200 sensitive information needs and a relevant example document, players use a special ClueWeb12 search interface that also hints at potentially useful search terms. For each obfuscated query, the system assigns points depending on the result quality and the formulated query. In a pilot study with 72 players, we observed that players find obfuscations that retrieve relevant documents, but often only when they rely on the suggested terms.

Notes

  1. cnet.com/news/google-is-giving-data-to-police-based-on-search-keywords-court-docs-show/

  2. Demo: https://demo.webis.de/city-of-disguise Screencast: https://demo.webis.de/city-of-disguise/screencast Code and Data: https://github.com/webis-de/ecir22-query-obfuscation-game

Author information

Corresponding author

Correspondence to Maik Fröbe.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Fröbe, M., Libera, N.L., Hagen, M. (2022). City of Disguise: A Query Obfuscation Game on the ClueWeb. In: Hagen, M., et al. Advances in Information Retrieval. ECIR 2022. Lecture Notes in Computer Science, vol 13186. Springer, Cham. https://doi.org/10.1007/978-3-030-99739-7_34

  • DOI: https://doi.org/10.1007/978-3-030-99739-7_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-99738-0

  • Online ISBN: 978-3-030-99739-7

  • eBook Packages: Computer Science (R0)
