Analyzing sensitive information leakage in trajectory embedding models

Authors:

Chenghu Zhou

SIGSPATIAL '22: Proceedings of the 30th International Conference on Advances in Geographic Information Systems

Article No.: 85, Pages 1 - 10

https://doi.org/10.1145/3557915.3561021

Published: 22 November 2022

Abstract

With the proliferation of mobile networks and location-based services, huge volumes of user trajectories are collected to analyze the similarity among users and to unveil human mobility patterns for downstream tasks such as point-of-interest recommendation and tourism planning. Recent work has studied trajectory embedding methods as an efficient means of computing trajectory similarity and as effective inputs for downstream tasks: these methods embed trajectories into latent vector spaces equipped with the Euclidean distance, so that distances approximate trajectory similarity and the vectors capture the characteristics of human mobility patterns. However, we demonstrate that such embeddings, though they hide the locations, can leak sensitive information about the trajectories when combined with auxiliary data. In this work, we propose trajectory embedding attack schemes to analyze the sensitive information leakage of the embedding vectors. In our experiments, we demonstrate that the passing areas, visited ROIs, and exact shapes of trajectories are vulnerable to attacks on the embedding vectors by an adversary with auxiliary information.
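
The kind of leakage the abstract describes can be illustrated with a toy sketch. The code below is not the attack scheme proposed in the paper; it is a minimal illustration under stated assumptions. The encoder (`make_encoder`, a fixed random linear projection of a resampled trajectory) is a hypothetical stand-in for a learned trajectory embedding model, and the attack (`infer_cells`) is a plain nearest-neighbor lookup. It assumes the adversary holds a small set of auxiliary trajectories and can query the same encoder, and it treats "passing areas" as unit grid cells. All function names and parameters are illustrative.

```python
import math
import random


def resample(traj, n=8):
    """Linearly resample a polyline to n points, evenly spaced by index."""
    out = []
    for i in range(n):
        t = i * (len(traj) - 1) / (n - 1)
        lo = int(math.floor(t))
        hi = min(lo + 1, len(traj) - 1)
        w = t - lo
        out.append((traj[lo][0] * (1 - w) + traj[hi][0] * w,
                    traj[lo][1] * (1 - w) + traj[hi][1] * w))
    return out


def make_encoder(dim=16, n=8, seed=0):
    """Hypothetical stand-in encoder: a fixed random linear projection."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(2 * n)] for _ in range(dim)]

    def encode(traj):
        flat = [c for p in resample(traj, n) for c in p]
        return [sum(w * x for w, x in zip(row, flat)) for row in weights]

    return encode


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cells(traj, size=1.0):
    """Grid cells (the 'passing areas' in this toy setting) hit by the points."""
    return {(int(x // size), int(y // size)) for x, y in traj}


def infer_cells(target_vec, auxiliary, encode):
    """Guess the target's cells from the nearest auxiliary embedding."""
    nearest = min(auxiliary, key=lambda t: euclidean(encode(t), target_vec))
    return cells(nearest)


encode = make_encoder()

# Auxiliary trajectories known to the adversary: horizontal, vertical, diagonal.
auxiliary = [
    [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5), (3.5, 0.5)],
    [(0.5, 0.5), (0.5, 1.5), (0.5, 2.5), (0.5, 3.5)],
    [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5), (3.5, 3.5)],
]

# Private trajectory: the adversary only ever sees encode(target).
target = [(0.4, 0.6), (1.6, 0.4), (2.4, 0.6), (3.6, 0.4)]

leaked = infer_cells(encode(target), auxiliary, encode)
print(sorted(leaked))
```

Because embedding distance tracks trajectory similarity, the closest auxiliary embedding tends to belong to a trajectory that covers the same areas as the hidden one, which is why the vector alone gives away more than it appears to.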

Cited By

Han H, Yang S, Ding J, Fu L, Wang X, Zhou C. (2024). Adversarial Reconstruction of Trajectories: Privacy Risks and Attack Models in Trajectory Embedding. Proceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems, 259-269. Online publication date: 29-Oct-2024.
https://dl.acm.org/doi/10.1145/3678717.3691274

Cai K, Zhang J, Hong Z, Shand W, Wang G, Zhang D, Chi J, Tian Y, Baeza-Yates R, Bonchi F. (2024). Where Have You Been? A Study of Privacy Risk for Point-of-Interest Recommendation. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 175-186. Online publication date: 25-Aug-2024.
https://dl.acm.org/doi/10.1145/3637528.3671758

Cheng M, Zhou Z, Zhang B, Wang Z, Gan J, Ren Z, Feng W, Lyu Y, Zhang H, Diao X. (2024). Efflex: Efficient and Flexible Pipeline for Spatio-Temporal Trajectory Graph Modeling and Representation Learning. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2546-2555. Online publication date: 17-Jun-2024.
https://doi.org/10.1109/CVPRW63382.2024.00261

Index Terms

Analyzing sensitive information leakage in trajectory embedding models
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Learning latent representations
2. Security and privacy
  1. Human and societal aspects of security and privacy
    1. Privacy protections

Published In

SIGSPATIAL '22: Proceedings of the 30th International Conference on Advances in Geographic Information Systems

November 2022

806 pages

ISBN:9781450395298

DOI:10.1145/3557915

General Chairs: Matthias Renz (Kiel University, Germany), Mohamed Sarwat (Wherobots Inc. / Arizona State University)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSPATIAL: ACM Special Interest Group on Spatial Information

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

NSF China
Natural Science Foundation of Shanghai
Shanghai Sailing Program

Conference

SIGSPATIAL '22

Sponsor:

SIGSPATIAL

SIGSPATIAL '22: The 30th International Conference on Advances in Geographic Information Systems

November 1 - 4, 2022

Seattle, Washington

Acceptance Rates

Overall Acceptance Rate 257 of 1,238 submissions, 21%

Bibliometrics

Article Metrics

Total Citations: 3
Total Downloads: 167
Downloads (Last 12 months): 24
Downloads (Last 6 weeks): 1

Reflects downloads up to 17 Feb 2025
