Abstract
Individuals are continually observed and monitored by many location-based services, such as social networks, telecommunication companies, and mobile networks. The resulting data streams, which are usually analyzed in real time, can reveal sensitive information about individuals, such as home and work locations or private mobility patterns. There is therefore a need for stream-processing algorithms that anonymize data in real time, ensuring certain privacy guarantees while keeping the error low. In this paper, we describe how statistical disclosure control (SDC) methods can be applied to a Call Detail Record (CDR) database in a streaming fashion to mask location information efficiently. We also report experimental results on a real database.
Notes
By mechanism, we refer to any kind of function or system used to query for data.
A max-heap is used to keep the greatest element on top of the queue.
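As a minimal sketch of the data structure mentioned in this note (not the authors' implementation), a max-heap can be built in Python on top of the standard `heapq` module, which provides a min-heap, by storing negated keys. The class name `MaxHeap` and the sample values are hypothetical and serve only as illustration.

```python
import heapq


class MaxHeap:
    """Max-heap built on heapq (a min-heap) by storing negated keys."""

    def __init__(self):
        self._items = []

    def push(self, value):
        # Negating the key makes the largest value the smallest stored key.
        heapq.heappush(self._items, -value)

    def peek(self):
        # The greatest element is always at index 0 (the top of the heap).
        return -self._items[0]

    def pop(self):
        # Remove and return the greatest element.
        return -heapq.heappop(self._items)

    def __len__(self):
        return len(self._items)


# Hypothetical sample values, e.g. distances between records.
heap = MaxHeap()
for distance in [2.5, 7.0, 1.2, 4.8]:
    heap.push(distance)

largest = heap.peek()  # greatest element, without removing it
```

Reading the top element takes constant time, and each insertion or removal takes logarithmic time in the number of stored elements, which is why this structure suits a queue that must repeatedly expose its greatest element.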
Cite this article
Nunez-del-Prado, M., Nin, J. Revisiting online anonymization algorithms to ensure location privacy. J Ambient Intell Human Comput 14, 15097–15108 (2023). https://doi.org/10.1007/s12652-019-01371-6
DOI: https://doi.org/10.1007/s12652-019-01371-6