
Computers & Security

Volume 110, November 2021, 102446

An optimized differential privacy scheme with reinforcement learning in VANET

https://doi.org/10.1016/j.cose.2021.102446

Abstract

The protection of vehicle trajectories in vehicular ad hoc networks (VANETs) faces many challenges. One of the most critical is keeping the balance between geographical location protection and semantic location protection. Traditional trajectory protection schemes focus on either geographical location protection or semantic location protection alone. Moreover, when trajectory privacy protection is carried out, each location is often given the same level of protection, which may leave sensitive locations under-protected and unimportant locations over-protected. In this paper, based on differential privacy, we propose an optimized differential privacy scheme with reinforcement learning in VANETs. The proposed scheme dynamically optimizes the privacy budget allocated to each location on the vehicle trajectory to reach a better balance between geolocation obfuscation and semantic security. Experimental results demonstrate that the proposed scheme reduces the risk of both geographical and semantic location leakage, and therefore ensures a balance between utility and privacy.

Introduction

Vehicular ad hoc networks (VANETs) enable drivers to enjoy diverse, comfortable, real-time services, such as car parking, accident avoidance, weather forecasts and online entertainment sharing (Khacheba et al., 2018). However, vehicle trajectory information may expose a vehicle's sensitive locations throughout data exchange and sharing processes (Weerasinghe and Fu, 2019; Zidani, Semchedine and Ayaida, 2018). If a vehicle appears frequently at hospitals, the owner may be in poor health; if it appears frequently at a bank, the owner is likely making frequent cash transactions. Malicious attackers can then infer the life habits, financial situation, personal health and even other secret information of the car owner (Luo et al., 2019a). Preventing privacy leakage of the vehicle trajectory is therefore an urgent and significant topic.

Over the past decade, academia has seen great progress in pseudonym- and anonymity-based vehicle trajectory privacy protection in VANETs. Anonymity schemes (Gedik and Liu, 2007) hide the correspondence between a published data record and a specific vehicle, so that the published vehicle location data remains publicly available while the privacy of key areas is still guaranteed. Pseudonym schemes (Petit et al., 2014) anonymize vehicle identity to cut the link between vehicle information and spatio-temporal information, making it difficult for service operators to infer the location of a specific vehicle and thereby protecting its location privacy.

However, the protection of vehicle trajectories faces a new issue: pseudonym and anonymity schemes (Al-ani et al., 2020; Singh et al., 2019) pay insufficient attention to balancing the protection of geographical and semantic locations against adversaries with enough background knowledge. A semantic location refers to a Point of Interest (POI) at a geographical location, such as a hospital, bank or school. We believe POIs have a higher chance of being identified. For example, if a privacy mechanism discloses a vehicle location hundreds of meters away from the real location at an insurance company, which is a POI, an attacker still has a high chance of affirming that the user is at that insurance company.

Differential privacy is widely used for trajectory privacy preservation (Ftaimi and Mazri, 2020; Ghosh et al., 2009; Wang et al., 2016). Under differential privacy, changing, deleting or adding a single record of a data set has only a very small effect on any query result. This property makes differential privacy suitable for protecting the locations on a vehicle trajectory.
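A common way to apply differential privacy to a single location is to perturb each coordinate with Laplace noise whose scale is sensitivity divided by the privacy budget epsilon. The sketch below illustrates only this generic pattern; the sensitivity value and per-coordinate noising are assumptions for illustration, not the paper's exact mechanism.

```python
import numpy as np

def laplace_obfuscate(lat, lon, epsilon, sensitivity=0.001, rng=None):
    """Release a noised location by adding i.i.d. Laplace noise per coordinate.

    The noise scale is sensitivity / epsilon, so a smaller privacy budget
    (epsilon) means a larger scale, i.e. stronger obfuscation.
    """
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return lat + rng.laplace(0.0, scale), lon + rng.laplace(0.0, scale)
```

This makes the budget/protection trade-off concrete: halving epsilon doubles the expected displacement of the released location.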

However, differential-privacy-based mechanisms also ignore the higher probability of POI disclosure, since differential privacy treats all points equally. In the privacy protection of a vehicle's geographical trajectory, privacy budget allocation determines the protection intensity of each location: generally, the smaller the privacy budget allocated to a location, the stronger the protection of that location. Traditional budget allocation in differential privacy gives a high level of geographical protection to key areas based on their historical access frequency. This straightforward allocation pattern often ignores the security of POIs.

The balance between semantic location protection and geographical location protection is also an important issue. Because the noise is random, if the deviation between an original location and its noised counterpart is too small, the exact semantic location can still be inferred by adversaries with background knowledge. On the other hand, if semantic security is emphasized too much, the geo-location obfuscation may become so large that the utility of the protection algorithm drops sharply. Privacy budget allocation has therefore always been a central concern in differential privacy algorithms. The traditional differential privacy mechanism applied to vehicle trajectories in VANETs thus faces an unavoidable challenge: a fixed privacy budget allocation may yield a non-optimal policy for protecting semantic locations, failing to balance geographical and semantic security. This paper argues that POI points may need more randomization than other points, and that this randomization can be controlled through the privacy budget.
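The idea of giving POIs a smaller budget share (hence more noise) while keeping the total budget fixed by sequential composition can be sketched as follows; the weighting scheme and the `poi_weight` parameter are illustrative assumptions, not the allocation policy the paper later learns.

```python
def allocate_budget(points, total_eps, poi_weight=0.5):
    """Split a total privacy budget across trajectory points.

    points: list of (location_id, is_poi) pairs.
    POI points get a smaller weight (poi_weight < 1), so they receive a
    smaller epsilon share and therefore stronger obfuscation; the shares
    sum to total_eps (sequential composition).
    """
    weights = [poi_weight if is_poi else 1.0 for _, is_poi in points]
    total_w = sum(weights)
    return {loc: total_eps * w / total_w
            for (loc, _), w in zip(points, weights)}
```

A fixed rule like this is exactly the kind of static policy the paper argues against; the reinforcement learning component described next replaces the hand-picked weights with learned adjustments.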

Reinforcement learning is a natural way to link location privacy with privacy budget allocation. A reinforcement learning agent has a clear goal and chooses actions that influence its environment; through repeated trials it updates a well-designed value function to approach the optimal strategy step by step. These properties let us treat the vehicle as the agent and the VANET as the environment: each location on a vehicle trajectory is a state, and the adjustment of that location's privacy budget allocation is the action taken in that state. In this way, privacy budget allocation for trajectory privacy protection can be considered within the framework of reinforcement learning.
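This mapping (locations as states, budget adjustments as actions) can be sketched with plain tabular Q-learning. The state names, the action set of budget adjustments, and the reward function passed in below are hypothetical placeholders; the paper's actual reward is defined later from segment similarity and semantic security.

```python
import random
from collections import defaultdict

def q_learning(trajectory, reward_fn, actions=(-0.1, 0.0, 0.1),
               episodes=500, alpha=0.1, gamma=0.9, explore=0.3, seed=0):
    """Tabular Q-learning over trajectory states.

    trajectory: ordered list of location states.
    actions: candidate adjustments to a location's budget share.
    reward_fn(state, action): feedback from the environment.
    """
    rng = random.Random(seed)
    Q = defaultdict(float)  # Q[(state, action)] -> estimated return
    for _ in range(episodes):
        for i, state in enumerate(trajectory[:-1]):
            # epsilon-greedy action selection
            if rng.random() < explore:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            r = reward_fn(state, action)
            nxt = trajectory[i + 1]
            best_next = max(Q[(nxt, a)] for a in actions)
            Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])
    return Q
```

After training, the greedy action at each state gives the learned budget adjustment for that location.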

This paper proposes an optimized differential privacy scheme with reinforcement learning in VANET. By combining a reinforcement learning model with differential privacy, the proposed scheme addresses both geographical and semantic location security on the vehicle trajectory. Our contributions are as follows:

  1) We embed a reinforcement learning model into the differential privacy mechanism to adjust the current privacy budget allocation policy automatically and obtain an optimal allocation policy.

  2) We combine segment similarity, which measures the similarity between two paths, with semantic location security to build a reward function that more effectively records the cumulative benefit in reinforcement learning.

  3) We use three well-designed evaluation indicators, security risk, utility loss and privacy gain, to measure the balance between geographical location protection and semantic location protection.
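The second contribution combines segment similarity and semantic security into a single reward. A minimal stand-in for such a combined reward is sketched below; the inverse-distance similarity, the `semantic_risk` input and the weight `w` are all hypothetical placeholders for the paper's actual seg-similarity and seman-security metrics.

```python
import math

def combined_reward(orig_seg, noised_seg, semantic_risk, w=0.5):
    """Toy reward: high when the noised segment stays geometrically close to
    the original (utility) while the semantic inference risk stays low.

    orig_seg, noised_seg: equal-length lists of (x, y) points.
    semantic_risk: value in [0, 1]; 1 means the POI is fully exposed.
    """
    # mean pointwise deviation between the two path segments
    d = sum(math.dist(p, q) for p, q in zip(orig_seg, noised_seg)) / len(orig_seg)
    similarity = 1.0 / (1.0 + d)  # 1 when identical, -> 0 as paths diverge
    return w * similarity + (1 - w) * (1.0 - semantic_risk)
```

The weight `w` makes the utility/privacy trade-off explicit: raising it favors path fidelity, lowering it favors semantic safety.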

The rest of this paper is organized as follows. Related work is presented in Section 2. Section 3 gives the background, and Section 4 describes the problem to be solved. Section 5 introduces the proposed scheme, and Section 6 presents the detailed results of extensive experiments. Section 7 concludes the paper.


Related work

Research communities have proposed many effective privacy protection methods for vehicle trajectories in VANETs. These methods fall roughly into two main categories: 1) anonymity or pseudonym schemes, and 2) differential privacy schemes.

Trajectory privacy in VANETs

To explain vehicle trajectory privacy issues in VANETs, we first introduce the basic structure of a VANET. Vehicles share real-time position data via on-board devices. A VANET has at least one Trusted Authority (TA) and several Roadside Units (RSUs). Vehicles transmit their requests to RSUs, which forward them to a trusted server. When a vehicle cannot connect to any RSU or other vehicle, some requests can also be sent directly to the LBS server via mobile communication

Problem description

In this section, we introduce the structure of the vehicle trajectory, the semantic location transfer matrix, and the possible attack strategies of adversaries. The notations involved in the problem description are listed in Table 1.

Proposed method

In this section, we introduce an optimized differential privacy scheme with reinforcement learning in VANET, named ODPRL (Optimized Differential Privacy Scheme with Reinforcement Learning), to select an optimized budget allocation and protect the semantic security of vehicle trajectory privacy. The ODPRL scheme works as shown in Fig. 4.

A reinforcement learning model runs in a specific environment. In each state transition within this environment, the agent receives information feedback from the

Experimental results

The proposed ODPRL algorithm introduces three metrics, SSR, ULS and PriGain, to evaluate semantic security, data utility and privacy gain on four datasets. During the evaluation of PriGain, we compare ODPRL with PSTPRL (Wang et al., 2019) and PDRL (Berri et al., 2020). Finally, we observe the effects of some parameters of ODPRL.

Conclusion

In this paper, we embed a reinforcement learning mechanism into differential privacy so that the privacy budget allocation is constantly updated toward an optimal policy for protecting both geographical and semantic location privacy. The proposed algorithm effectively combines two metrics, seg-similarity and seman-security, to define the reward function between states, which guarantees the effectiveness and efficiency of learning. In addition, the proposed algorithm uses well-designed metrics to evaluate

CRediT authorship contribution statement

Xin Chen: Conceptualization, Methodology, Writing – original draft. Tao Zhang: Writing – original draft, Validation. Sheng Shen: Software, Validation. Tianqing Zhu: Conceptualization, Methodology. Ping Xiong: Methodology, Writing – original draft.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Xin Chen is currently a lecturer in the Faculty of Mathematics and Computer Science, Wuhan Polytechnic University. He received his BEng degree from Central China Normal University, China, in 2001 and his MEng degree from Wuhan University, China, in 2008. His research focuses on differential privacy. Email: [email protected]

References (34)

  • F. Cuppens et al.

    Optimal distribution of privacy budget in differential privacy

    Risks and Security of Internet and Systems: 13th International Conference, CRiSIS 2018, Arcachon, France, October 16–18, 2018, Revised Selected Papers

    (2019)
  • C. Dwork

    Differential privacy

    Automata, languages and programming

    (2006)
  • Z. Fan et al.

    Apdpk-means: A new differential privacy clustering algorithm based on arithmetic progression privacy budget allocation

    2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)

    (2019)
  • Frejinger, E., 2008. Route choice analysis: data, models, algorithms and...
  • E. Frejinger et al.

    Route choice analysis: data, models, algorithms and applications

    EPFL

    (2012)
  • S. Ftaimi et al.

    A comparative study of machine learning algorithms for VANET networks

    The 3rd International Conference on Networking, Information Systems & Security

    (2020)
  • B. Gedik et al.

    Protecting location privacy with personalized k-anonymity: architecture and algorithms

    IEEE Trans. Mob. Comput.

    (2007)


    Tao Zhang received the B.Eng and M.Eng degrees from the Information Engineering School, Nanchang University, China, in 2015 and 2018, respectively. Currently, he works towards his Ph.D degree with the school of Computer Science in the University of Technology Sydney, Australia. His research interests include privacy-preserving, AI fairness, and machine learning. Email: [email protected]

    Ping Xiong received his BEng degree from LanZhou Jiaotong University, China in 1997. He received his MEng and Ph.D. degrees from Wuhan University, China, in 2002 and 2005, respectively. He is currently a professor in the School of Information and Security Engineering, Zhongnan University of Economics and Law, China. His research interests are network security, data mining and privacy preservation. Email: [email protected]

    Dr Tianqing Zhu is currently an Associate Professor in the School of Computer Science, University of Technology Sydney (UTS), Australia. She received her BEng and MEng degrees from Wuhan University, China, in 2000 and 2004, respectively, and a Ph.D. degree in Computer Science from Deakin University, Australia, in 2014. Before joining UTS, she was a lecturer in the School of Information Technology, Deakin University, Australia, from 2014 to 2018. Her research interests include privacy preservation and cyber security. Email: [email protected]

    Sheng Shen, a current Ph.D. student in University of Technology Sydney. He received the Bachelor of Engineering (Honors) degree in Information and Communication Technology from the University of Technology Sydney in 2017, and Master of Information Technology degree in University of Sydney in 2018. His current research interests include data privacy preserving, differential privacy and federated learning. Email: [email protected]
