A Literature Survey and Classifications on Data Deanonymisation

Al-Azizy, Dalal; Millard, David; Symeonidis, Iraklis; O’Hara, Kieron; Shadbolt, Nigel

doi:10.1007/978-3-319-31811-0_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9572)

Included in the following conference series:

International Conference on Risks and Security of Internet and Systems

Abstract

The problem of disclosing private anonymous data has become increasingly serious, particularly with the possibility of carrying out deanonymisation attacks on published data. The related work available in the literature is inadequate in terms of the number of techniques analysed, and is limited to certain contexts such as Online Social Networks. We survey a large number of state-of-the-art deanonymisation techniques that use various methods and target different types of data. Our aim is to build a comprehensive understanding of the problem. For this survey, we propose a framework to guide a thorough analysis and classification. We classify deanonymisation approaches by the type and source of auxiliary information and by the structure of the target datasets. Moreover, we identify potential attacks, threats and some suggested assistive techniques. This can help researchers understand the deanonymisation problem and assist in the advancement of privacy protection.

Acknowledgments

This research is funded by the University of Tabuk in Saudi Arabia and supported by the Saudi Arabian Cultural Bureau in London.

Author information

Authors and Affiliations

Web and Internet Science, School of Electronics and Computer Science, University of Southampton, Southampton, UK
Dalal Al-Azizy, David Millard & Kieron O’Hara
University of Tabuk, Tabuk, Saudi Arabia
Dalal Al-Azizy
ESAT/COSIC, KU Leuven and iMinds, Leuven, Belgium
Iraklis Symeonidis
Department of Computer Science, University of Oxford, Oxford, UK
Nigel Shadbolt

Corresponding author

Correspondence to Dalal Al-Azizy.

Editor information

Editors and Affiliations

University of Piraeus, Piraeus, Greece
Costas Lambrinoudakis
Université de la Polynésie Française, Faa'a, French Polynesia
Alban Gabillon

Cite this paper

Al-Azizy, D., Millard, D., Symeonidis, I., O’Hara, K., Shadbolt, N. (2016). A Literature Survey and Classifications on Data Deanonymisation. In: Lambrinoudakis, C., Gabillon, A. (eds) Risks and Security of Internet and Systems. CRiSIS 2015. Lecture Notes in Computer Science, vol 9572. Springer, Cham. https://doi.org/10.1007/978-3-319-31811-0_3

DOI: https://doi.org/10.1007/978-3-319-31811-0_3
Published: 02 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31810-3
Online ISBN: 978-3-319-31811-0
eBook Packages: Computer Science, Computer Science (R0)
