Elsevier

Neurocomputing

Volume 472, 1 February 2022, Pages 201-211

OPTDP: Towards optimal personalized trajectory differential privacy for trajectory data publishing

https://doi.org/10.1016/j.neucom.2021.04.137

Abstract

With the development of location-based applications, more and more trajectory data are being collected. Trajectory data often contain users’ sensitive information, and releasing them directly may threaten users’ privacy. Differential privacy, a privacy-preserving method with a solid mathematical foundation, has been widely used in trajectory data publishing. However, current trajectory data publishing methods based on differential privacy cannot fully realize personalized privacy protection. In this paper, an optimal personalized trajectory differential privacy mechanism is proposed. First, by establishing a probabilistic mobility model of trajectories, we cluster the locations to achieve semantic location matching between different trajectories. Based on the semantic similarity, we identify the template trajectories and propose a privacy level allocation method based on stay-points and frequent sub-trajectories. Then, according to the location matching results, we automatically identify the privacy level of all locations. Combined with the optimal location differential privacy mechanism, we perturb the location points on each user’s trajectory before publishing, with different location privacy levels corresponding to different privacy budgets. Experimental results on real-world datasets show that our mechanism provides a better tradeoff between privacy protection and data utility than traditional differential privacy methods.

Introduction

In recent years, with the widespread use of location-aware devices, a series of location-based services have been developed, generating a large amount of trajectory data [1], [2], [3], [4]. With the help of these trajectory data, we can conduct urban planning and market analysis more intelligently, and analyze and mine users’ personal trajectory information to provide personalized recommendations [5], [6], [7].

However, the collected trajectory data often contain sensitive personal information, and disclosing them publicly may raise serious privacy concerns [8], [9], [10], [11]. An attacker with background knowledge may discover the home address, lifestyle habits, health status, and social relationships of a specific user from the trajectory data. Therefore, there is growing interest in developing privacy protection mechanisms for trajectory data that meet individual privacy needs [12], [13], [14], [15].

As a promising technique, differential privacy (DP) [16] has been applied to protect trajectory data [17], [18], [19]. Most, if not all, of these methods assume that all locations require the same degree of privacy protection, so they set the same privacy budget for all points and provide uniform privacy protection for all locations [18], [19]. As a result, some users’ privacy is overprotected, which reduces data utility, while other users’ privacy is insufficiently protected.

In recent years, in order to balance privacy protection and data utility, researchers have introduced personalized DP mechanisms into trajectory privacy protection. Personalized DP allows the privacy level of each record to be adjusted based on the privacy needs of the data or the user [20], [21]. However, existing works only set different privacy protection for different users or for different locations, and cannot achieve privacy protection at both the user level and the location level. In reality, each user’s protection needs for each location on their own trajectory may change depending on the specific purpose, behavior, location sensitivity, and many other factors. Moreover, in existing work the trajectory privacy levels must be set manually, which is cumbersome in practical applications. An analysis of practical demands shows that existing personalized trajectory privacy protection mechanisms cannot deliver personalized protection at the level of an individual user’s locations.

To fill this gap, we propose an optimal personalized trajectory DP mechanism for trajectory data publishing. To avoid manually designating the privacy level, we exploit the relationship between location features and privacy requirements, which makes it possible to determine the privacy requirement of a particular user’s location based on whether it is a stay-point and whether it lies on a frequent sub-trajectory. To identify the location privacy levels of a large number of trajectories while reducing the workload of location feature extraction, we propose a location matching method based on semantic similarity, which automatically learns the privacy level of a location without feature extraction. More concretely, we first build a probabilistic mobility model of the trajectories, cluster the locations on different trajectories based on semantic similarity, obtain the best semantic location matching between different trajectories, and compute the semantic similarity under this matching. Then, the most representative trajectories are selected as template trajectories according to the semantic similarity, and their location privacy levels are determined. From the matching results between the template trajectories and the other trajectories, the privacy levels of all trajectory locations are obtained. Finally, the corresponding DP budget is allocated according to the privacy level, and the final publishable trajectory data are obtained by applying the optimal trajectory DP mechanism.
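The last step of this pipeline, allocating a budget per privacy level and perturbing each point, can be illustrated with a minimal sketch. It is not the authors' mechanism: the optimal location DP mechanism is replaced here by the simpler planar Laplace mechanism from geo-indistinguishability, and the function names, the ordering of the four levels, and the epsilon values in `EPSILON_BY_LEVEL` are all assumptions made for illustration. Points are assumed to be planar coordinates in kilometres rather than raw longitude and latitude.

```python
import math
import random
from typing import List, Sequence, Tuple

# Illustrative budgets (per km) for four privacy levels; smaller epsilon = more noise.
# These values are assumptions, not the settings used in the paper.
EPSILON_BY_LEVEL = {1: 2.0, 2: 1.0, 3: 0.5, 4: 0.25}


def privacy_level(is_stay_point: bool, in_frequent_subtrajectory: bool) -> int:
    """Map two boolean location features to a level in 1..4 (4 = most sensitive).

    The relative ordering of the two middle levels is an assumption.
    """
    return 1 + int(in_frequent_subtrajectory) + 2 * int(is_stay_point)


def planar_laplace(
    x: float, y: float, epsilon: float, rng: random.Random
) -> Tuple[float, float]:
    """Add planar Laplace noise: uniform angle, radius ~ Gamma(shape=2, scale=1/epsilon)."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    radius = rng.gammavariate(2.0, 1.0 / epsilon)
    return x + radius * math.cos(theta), y + radius * math.sin(theta)


def perturb_trajectory(
    points: Sequence[Tuple[float, float]],
    features: Sequence[Tuple[bool, bool]],
    rng: random.Random,
) -> List[Tuple[float, float]]:
    """Perturb each planar point (in km) with the budget of its privacy level."""
    noisy = []
    for (x, y), (stay, frequent) in zip(points, features):
        eps = EPSILON_BY_LEVEL[privacy_level(stay, frequent)]
        noisy.append(planar_laplace(x, y, eps, rng))
    return noisy


rng = random.Random(0)
points = [(0.0, 0.0), (1.2, 0.4), (3.5, 2.0)]
features = [(True, True), (False, True), (False, False)]  # (stay-point, frequent)
print(perturb_trajectory(points, features, rng))
```

Under planar Laplace noise the expected displacement is 2/epsilon, so in this sketch a level-4 point moves about 8 km on average and a level-1 point about 1 km. The sketch treats points independently and does not account for correlations between consecutive locations.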

The major contributions of this paper are summarized as follows.

  • For the first time, we propose a location privacy level identification method based on stay-points and frequent sub-trajectories for self-learning of privacy preferences. This method fully considers the relationship between statistical location features and privacy requirements, divides location privacy into four levels, and realizes personalized privacy level identification at both the user level and the location level.

  • We propose a trajectory location matching algorithm that automatically learns the privacy level of trajectory locations without feature extraction. The algorithm establishes a probabilistic mobility model for each user’s trajectory and clusters the locations across trajectories based on the mobility model and the earth mover’s distance (EMD), so as to obtain the best semantic location matching results and the statistical semantic similarity.

  • We propose the optimal personalized trajectory DP (OPTDP), which realizes personalized privacy protection based on optimal trajectory differential privacy. Experimental results on real data show that our algorithm significantly improves on existing algorithms in both privacy protection and data utility.
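To make the role of the earth mover’s distance in location matching more concrete, here is a small sketch under strong assumptions. Each location is summarized by a hypothetical visit histogram over ordered time slots (the paper’s actual probabilistic mobility model is not given in this excerpt), the EMD is computed in its one-dimensional form, and locations are matched by a simple nearest-neighbour rule rather than by the clustering procedure the paper describes. The names `emd_1d` and `match_locations` are illustrative.

```python
from itertools import accumulate
from typing import Dict, Sequence, Tuple


def emd_1d(p: Sequence[float], q: Sequence[float]) -> float:
    """Earth mover's distance between two histograms over the same ordered,
    unit-spaced bins, computed as the L1 distance between their CDFs."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    cdf_p = accumulate(x / total_p for x in p)
    cdf_q = accumulate(x / total_q for x in q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


def match_locations(
    profiles_a: Dict[str, Sequence[float]],
    profiles_b: Dict[str, Sequence[float]],
) -> Dict[str, Tuple[str, float]]:
    """For each location in A, pick the location in B with the smallest EMD.

    This is a nearest-neighbour simplification, not an optimal assignment.
    """
    matches = {}
    for name_a, hist_a in profiles_a.items():
        best = min(profiles_b, key=lambda name_b: emd_1d(hist_a, profiles_b[name_b]))
        matches[name_a] = (best, emd_1d(hist_a, profiles_b[best]))
    return matches


# Hypothetical visit histograms over three time slots (morning, midday, evening).
user_a = {"home": [0.7, 0.1, 0.2], "work": [0.0, 0.9, 0.1]}
user_b = {"x": [0.6, 0.2, 0.2], "y": [0.1, 0.8, 0.1]}
print(match_locations(user_a, user_b))
```

In this toy data, "home" is closest to "x" and "work" is closest to "y", each at an EMD of roughly 0.1. The linear bin order ignores the wrap-around of time of day, which a fuller treatment would need to handle.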

The remainder of this paper is organized as follows. Section 2 provides some preliminaries and basic definitions. Section 3 proposes our optimal personalized trajectory DP mechanism. Section 4 presents the experimental evaluation, and Section 5 reviews some related work. Finally, Section 6 concludes this paper.

Location Features on Trajectory Privacy

A trajectory dataset $D$ is the set of all trajectories: $D = \{\tilde{T}_i \mid i = 1, 2, \ldots, |D|\}$, and a trajectory $\tilde{T}_i$ is an ordered list of locations: $\tilde{T}_i = \{\tilde{l}_{i,j} \mid j = 1, 2, \ldots, |\tilde{T}_i|\}$, where $|D|$ is the number of trajectories in $D$, and $|\tilde{T}_i|$ is the length of $\tilde{T}_i$. $\tilde{l}_{i,j}$ is the location of the trajectory $\tilde{T}_i$ at the $j$th timestamp $t_j$, which is in the form of $(\mathit{lng}, \mathit{lat})$, where $\mathit{lng}$ and $\mathit{lat}$ denote the longitude and latitude, respectively.
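As a concrete reading of this definition, the dataset can be held as a list of trajectories, each an ordered list of (longitude, latitude) locations. The sketch below is one possible representation; the class and function names and the coordinate values are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Location:
    """One location of a trajectory, in the (lng, lat) form of the definition."""
    lng: float  # longitude in degrees
    lat: float  # latitude in degrees


# A trajectory is an ordered list of locations; the list index plays the role
# of the timestamp index j. A dataset D is a list of trajectories.
Trajectory = List[Location]
Dataset = List[Trajectory]


def dataset_sizes(dataset: Dataset) -> Tuple[int, List[int]]:
    """Return |D| and the length of every trajectory in D."""
    return len(dataset), [len(trajectory) for trajectory in dataset]


# Made-up coordinates, for illustration only.
D: Dataset = [
    [Location(116.30, 39.98), Location(116.31, 39.99), Location(116.33, 40.00)],
    [Location(116.40, 39.90), Location(116.41, 39.91)],
]
print(dataset_sizes(D))
```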

In order to facilitate the follow-up trajectory processing, we map the GPS coordinates to

PDP trajectory preserving design

We formalize our task as realizing personalized trajectory DP in the trajectory data publishing scenario, so that data utility is better preserved while strict privacy guarantees are provided. To this end, we propose a personalized trajectory DP system based on semantic location matching. Fig. 1 illustrates the architecture of the system, where the input is the original trajectory dataset, and the output is the protected trajectory dataset for publishing. Our system mainly involves

Performance evaluation

In this section, we first describe the experiment settings and then evaluate the performance of our optimal personalized trajectory DP mechanism.

Early trajectory privacy preserving methods

In recent years, many privacy protection methods based on k-anonymity [37], [38], [39] (and its variants such as l-diversity [40] and t-closeness [41], [42]) have been proposed, which can protect data privacy to a certain extent. These methods protect privacy by concealing the true location: they usually add dummy locations and require that the true location be indistinguishable within a set of k points. However, all of them are sensitive to the adversary’s prior knowledge about the target user’s location distribution [18].

Conclusion

In this paper, we have proposed an optimal personalized trajectory DP mechanism for trajectory data publishing. Unlike traditional trajectory DP mechanisms, our method automatically generates the corresponding privacy level for each user’s location, provides differentiated privacy protection, and balances privacy protection and data utility. The experimental results show that our method is able to protect the user’s sensitive locations as well as improve the

CRediT authorship contribution statement

Wenqing Cheng: Conceptualization, Methodology, Writing - original draft, Project administration. Ruxue Wen: Methodology, Writing - original draft, Software, Validation. Haojun Huang: Data curation, Formal analysis. Wang Miao: Formal analysis, Visualization. Chen Wang: Writing - review & editing, Conceptualization, Methodology, Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China under Grants 61872416, 62002104, 62071192, and 52031009; by the Fundamental Research Funds for the Central Universities of China under Grant 2019kfyXJJS017; by the Key Research and Development Program of Hubei Province under Grant 2020BAB120; and by the Open Research Project of Hubei Key Laboratory of Intelligent Geo-Information Processing under Grant KLIGIP-2018A03. Part of this work was presented at the

Wenqing Cheng received the B.S. degree in telecommunication engineering and Ph.D. degree in electronics and information engineering from Huazhong University of Science and Technology, China, in 1985 and 2005, respectively. She is currently a full professor with the School of Electronic Information and Communications, Huazhong University of Science and Technology, China. Her research interests focus on information processing in mobile systems. She is a member of IEEE.

References (50)

  • H. Huang, W. Miao, G. Min, C. Huang, X. Zhang, C. Wang, Resilient range-based d-dimensional localization for mobile...
  • X. He et al., Leveraging spatial diversity for privacy-aware location-based services in mobile networks, IEEE Trans. Inf. Forensics Secur. (2018)
  • P. Zhao et al., ILLIA: Enabling k-anonymity-based privacy preserving against location injection attacks in continuous LBS queries, IEEE Internet Things J. (2018)
  • C. Li et al., Privacy in internet of things: From principles to technologies, IEEE Internet Things J. (2019)
  • Y. Liu et al., DeePGA: A privacy-preserving data aggregation game in crowdsensing via deep reinforcement learning, IEEE Internet Things J. (2020)
  • S. Ghane et al., TGM: A generative mechanism for publishing trajectories with differential privacy, IEEE Internet Things J. (2020)
  • W. Zhang et al., Online location trace privacy: An information theoretic approach, IEEE Trans. Inf. Forensics Secur. (2019)
  • Z. Xiao et al., QLDS: A novel design scheme for trajectory privacy protection with utility guarantee in participatory sensing, IEEE Trans. Mob. Comput. (2018)
  • C. Dwork, Differential privacy, Proc. ICALP (2006)
  • R. Chen, B. Fung, B.C. Desai, Differentially private trajectory data publication,...
  • M.E. Andres, N.E. Bordenabe, K. Chatzikokolakis, C. Palamidessi, Geo-indistinguishability: Differential privacy for...
  • N.E. Bordenabe et al., Optimal geo-indistinguishable mechanisms for location privacy, Proc. ACM CCS (2014)
  • F. Tian et al., A novel personalized differential privacy mechanism for trajectory data publication, Proc. NaNA (2017)
  • L. Rossi et al., It’s the way you check-in: identifying users in location-based social networks, Proc. ACM WOSN (2014)
  • R. Shokri, G. Theodorakopoulos, J. Le Boudec, J. Hubaux, Quantifying location privacy, in: Proceedings of IEEE S&P,...

Ruxue Wen received the B.E. degree from Wuhan University of Technology, China, in 2019. She is currently pursuing the M.S. degree in Electronics and Information Engineering at Huazhong University of Science and Technology, China. Her research interests focus on data privacy and mobile computing.

Haojun Huang received the B.S. degree from the School of Computer Science and Technology, Wuhan University of Technology, China, in 2005, and the Ph.D. degree in School of Communication and Information Engineering, University of Electronic Science and Technology, China, in 2012. He was a post-doctoral researcher with the Research Institute of Information Technology, Tsinghua University, Beijing, from 2012 to 2015, and an assistant professor with Wuhan University, China, from 2015 to 2017. He is currently an associate professor at Huazhong University of Science and Technology, China. His research interests include wireless networks, big data, and software-defined networking.

Wang Miao received his Ph.D. degree in Computer Science from the University of Exeter, United Kingdom in 2017. He is currently a Postdoctoral Research Associate at the College of Engineering, Mathematics, and Physical Sciences of the University of Exeter. His research interests focus on Vehicle Edge Computing, Artificial Intelligence, Wireless Communication Networks, and Performance Modelling and Analysis.

Chen Wang received the B.S. and Ph.D. degrees from the Department of Automation, Wuhan University, China, in 2008 and 2013, respectively. From 2013 to 2017, he was a postdoctoral research fellow in the Networked and Communication Systems Research Lab, Huazhong University of Science and Technology, China. Thereafter, he joined the faculty of Huazhong University of Science and Technology where he is currently an associate professor. His research interests are in the broad areas of wireless networking, Internet of Things, and mobile computing, with a recent focus on privacy issues in wireless and mobile systems. He is a senior member of IEEE and ACM.