DOI: 10.1145/3132847.3132998
research-article

From Fingerprint to Footprint: Revealing Physical World Privacy Leakage by Cyberspace Cookie Logs

Published: 06 November 2017

ABSTRACT

It is well known that online services use various cookies to track users through their online service identifiers (IDs); in other words, when users access online services, they leave various "fingerprints" behind in cyberspace. As they roam the physical world while accessing online services via mobile devices, users also leave a series of "footprints" (i.e., hints about their physical locations) in the physical world. This poses a potent new threat to user privacy: one can potentially correlate the "fingerprints" users leave in cyberspace with the "footprints" they leave in the physical world to infer and reveal private information about their physical world activity, such as frequently visited locations or mobility trajectories. We refer to this problem as user physical world privacy leakage via user cyberspace privacy leakage. In this paper we address the following fundamental question: what kind, and how much, of user physical world privacy might be leaked if we could get hold of such diverse network datasets, even without any physical location information? To investigate this question in depth, we use network data collected via a deep packet inspection (DPI) system at the routers of one of the largest Internet operators in Shanghai, China, over a duration of one month. We decompose the fundamental question into three problems: i) linkage of various online user IDs belonging to the same person via mobility pattern mining; ii) physical location classification via aggregate user mobility patterns over time; and iii) tracking user physical mobility. By developing novel and effective methods for solving each of these problems, we demonstrate that user physical world privacy leakage via user cyberspace privacy leakage is not hypothetical, but indeed poses a real and potent threat to user privacy.

    • Published in

      CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
      November 2017
      2604 pages
      ISBN:9781450349185
      DOI:10.1145/3132847

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      CIKM '17 Paper Acceptance Rate: 171 of 855 submissions, 20%. Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%.
