An optimized differential privacy scheme with reinforcement learning in VANET
Introduction
Vehicular ad hoc networks (VANETs) enable drivers to enjoy diverse, comfortable, real-time services such as parking assistance, accident reduction, weather forecasting, and online entertainment sharing (Khacheba et al., 2018). However, a vehicle's trajectory data may expose sensitive locations during data exchange and sharing (Weerasinghe and Fu, 2019; Zidani et al., 2018). If a vehicle appears frequently at hospitals, its owner may be in poor health; if it appears frequently at a bank, the owner is likely to make frequent cash transactions. Malicious attackers can then infer the owner's life habits, financial situation, personal health, and other secret information (Luo et al., 2019a). Preventing privacy leakage from vehicle trajectories is therefore an urgent and significant topic.
Over the past decade, academia has made great progress in pseudonym- and anonymity-based trajectory privacy protection in VANET. Anonymity schemes (Gedik and Liu, 2007) hide the correspondence between a published data record and a specific vehicle, keeping the published location data publicly usable while still guaranteeing the privacy of certain key areas. Pseudonym schemes (Petit et al., 2014) anonymize vehicle identity to cut the link between a vehicle and its spatio-temporal information, making it difficult for service operators to infer the location of a specific vehicle and thereby protecting its location privacy.
However, vehicle trajectory protection faces a new issue: pseudonym and anonymity schemes (Al-ani et al., 2020; Singh et al., 2019) pay insufficient attention to balancing the protection of geographical locations against that of semantic locations when facing adversaries with sufficient background knowledge. A semantic location is a Point of Interest (POI) attached to a geographical location, such as a hospital, bank, or school. We believe POIs have a higher possibility of being identified. For example, if a privacy mechanism discloses a vehicle location hundreds of meters away from the vehicle's real location at an insurance company (a POI), the attacker still has a high chance of affirming that the user is at that insurance company.
Differential privacy is widely used for trajectory privacy preservation (Ftaimi and Mazri, 2020; Ghosh et al., 2009; Wang et al., 2016). Under differential privacy, changing, deleting, or adding a single record of a dataset has only a small effect on any query result. This property makes differential privacy suitable for protecting locations on a vehicle trajectory.
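As a concrete illustration of this property, the Laplace mechanism adds noise whose scale is inversely proportional to the privacy budget ε. The sketch below is a minimal assumption-laden example (the flat coordinate representation and the unit sensitivity are our own illustrative choices, not the paper's exact mechanism):

```python
import math
import random

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample by inverse-transform sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def obfuscate_location(lat, lon, epsilon, sensitivity=1.0, rng=None):
    """Perturb one coordinate pair under epsilon-differential privacy.

    The noise scale is sensitivity / epsilon, so a smaller privacy
    budget epsilon produces larger noise and stronger protection.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return lat + laplace_noise(scale, rng), lon + laplace_noise(scale, rng)
```

Calling `obfuscate_location` with a small ε yields a heavily noised point, with a large ε a point close to the original, which is exactly the knob the budget allocation below turns per location.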
However, existing differential privacy mechanisms also ignore the higher probability of POI disclosure, because differential privacy treats all points equally. In geographical trajectory protection, privacy budget allocation determines the protection intensity of each location: generally speaking, the smaller the privacy budget allocated to a location, the stronger its protection. Traditional budget allocation in differential privacy focuses on giving key areas strong geographical protection based on historical access frequency. This straightforward distribution pattern often overlooks POI security. The balance between semantic and geographical location protection is also an important issue. Owing to the randomness of the noise, if the deviation between an original location and its noised counterpart is too small, an adversary with background knowledge may still infer the exact semantic location; conversely, if semantic security is over-emphasized, the geographic obfuscation may become so large that the utility of the protection algorithm drops sharply. Privacy budget allocation has therefore always been a central concern in differential privacy. Traditional differential privacy mechanisms applied to vehicle trajectories in VANET thus face an unavoidable challenge: a fixed budget allocation scheme may yield a non-optimal policy for protecting semantic locations and fail to balance geographical security against semantic security. This paper argues that POI points may need more randomization than other points, and that this randomization can be controlled through the privacy budget.
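To make the budget-allocation idea concrete, here is a hypothetical allocation rule that gives POI locations a smaller share of the total trajectory budget; the `poi_weight` parameter and the weighting scheme are our own illustrative assumptions, not the paper's policy:

```python
def allocate_budget(total_epsilon, is_poi, poi_weight=0.5):
    """Split a trajectory's total budget across its locations.

    POI locations get a down-weighted share (a smaller epsilon means
    more noise, hence stronger protection); the shares are normalized
    so the per-location budgets sum to total_epsilon, matching the
    sequential composition property of differential privacy.
    """
    weights = [poi_weight if flag else 1.0 for flag in is_poi]
    total_weight = sum(weights)
    return [total_epsilon * w / total_weight for w in weights]
```

With `poi_weight=0.5`, a POI location receives half the budget of an ordinary location and therefore noticeably more noise; a fixed rule like this is exactly what the reinforcement learning component is meant to replace with a learned policy.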
Reinforcement learning is a suitable way to link location privacy with differential privacy budget allocation. A reinforcement learning agent has a clear goal, chooses actions that influence its environment, and constantly tries and updates a well-designed value function to determine the optimal strategy step by step. These properties let us treat the vehicle as the agent and the VANET as the environment: each location on a vehicle trajectory is regarded as a state, and the budget allocation adjustment at each location is regarded as the action taken in that state. In this way, privacy budget allocation for trajectory protection can be considered within the framework of reinforcement learning.
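This framing can be sketched as a tabular Q-learning agent whose states are trajectory locations and whose actions adjust the local privacy budget; the action set, hyperparameters, and class name below are illustrative assumptions, not the paper's exact design:

```python
import random
from collections import defaultdict

ACTIONS = (-0.1, 0.0, 0.1)  # hypothetical per-location budget adjustments

class BudgetAgent:
    """Tabular Q-learning over (location state, budget adjustment) pairs."""

    def __init__(self, alpha=0.1, gamma=0.9, explore=0.2, rng=None):
        self.q = defaultdict(float)          # (state, action) -> value
        self.alpha, self.gamma, self.explore = alpha, gamma, explore
        self.rng = rng or random.Random()

    def choose(self, state):
        if self.rng.random() < self.explore:                   # explore
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])  # exploit

    def update(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error
```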
This paper proposes an optimized differential privacy scheme with reinforcement learning in VANET. The proposed scheme addresses not only the geographic security but also the semantic security of locations on a vehicle trajectory by combining a reinforcement learning model with differential privacy. Our contributions are as follows:
1) We embed a reinforcement learning model into the differential privacy mechanism to automatically adjust the current privacy budget allocation policy and obtain an optimal one.
2) We combine segment similarity, which measures the similarity between two paths, with semantic location security to generate a reward function that more effectively records the cumulative benefit in reinforcement learning.
3) We use three well-designed evaluation indicators, security risk, utility loss, and privacy gain, to measure the balance between geographical location protection and semantic location protection.
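As a rough illustration of the second contribution, a reward could blend a segment-similarity utility term with a semantic-security term; the concrete functional forms below are our own hypothetical choices, not the paper's definitions:

```python
import math

def seg_similarity(orig_seg, noised_seg):
    """Utility term: closeness of a noised segment to the original,
    mapped into (0, 1] via 1 / (1 + mean pointwise distance)."""
    mean_d = sum(math.dist(p, q) for p, q in zip(orig_seg, noised_seg)) / len(orig_seg)
    return 1.0 / (1.0 + mean_d)

def seman_security(point, pois, radius=0.5):
    """Security term: 1 once the point is outside every POI's
    identification radius, decaying linearly toward 0 near a POI."""
    d_min = min(math.dist(point, p) for p in pois)
    return min(1.0, d_min / radius)

def reward(orig_seg, noised_seg, pois, lam=0.5):
    """Trade geographic utility against semantic security with weight lam."""
    security = min(seman_security(p, pois) for p in noised_seg)
    return lam * seg_similarity(orig_seg, noised_seg) + (1.0 - lam) * security
```

A reward of this shape is maximal only when the noised segment stays close to the original (good utility) while every noised point keeps clear of the POIs (good semantic security), which is the trade-off the contributions describe.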
The rest of this paper is organized as follows. Section 2 presents related work. Section 3 gives the background, and Section 4 describes the problem to be solved. Section 5 introduces the proposed scheme, Section 6 demonstrates the detailed results of extensive experiments, and Section 7 concludes.
Related work
Research communities have presented many effective privacy protection methods for vehicle trajectories in VANET. These methods fall roughly into two main categories: 1) anonymity or pseudonym schemes, and 2) differential privacy.
Trajectory privacy in VANETs
To explain vehicle trajectory privacy issues in VANET, we first introduce the basic structure of VANET. Vehicles share real-time position data via on-board devices. A VANET has at least one Trusted Authority (TA) and some Roadside Units (RSUs). Vehicles transmit their requests to RSUs, which forward them to a trusted server. When a vehicle fails to create connections with any RSU or other vehicles, some requests can also be sent directly to the LBS server via mobile communication
Problem description
In this section, we introduce the structure of a vehicle trajectory, the semantic location transfer matrix, and the possible attack strategies of adversaries. The notations involved in the problem description are listed in Table 1.
Proposed method
In this section, we introduce an optimized differential privacy scheme with reinforcement learning in VANET, named ODPRL (Optimized Differential Privacy Scheme with Reinforcement Learning), which selects an optimized budget allocation and protects the semantic security of vehicle trajectory privacy. The ODPRL scheme works as shown in Fig. 4.
A reinforcement learning model would run in a specific environment. In this environment, in each process of state transfer, agent can feel the information feedback from the
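Putting the pieces together, one episode of such a scheme could look as follows; the function name, reward shaping, exploration rate, and noise model are all illustrative assumptions rather than the ODPRL algorithm itself:

```python
import math
import random

def run_episode(trajectory, pois, q, actions=(-0.05, 0.0, 0.05),
                base_eps=1.0, alpha=0.1, gamma=0.9, rng=None):
    """One pass over a trajectory: at each location (the state) the agent
    picks a budget adjustment (the action), the location is noised with
    the resulting budget, and the reward favors noised points that stay
    close to the original yet clear of nearby POIs."""
    rng = rng or random.Random()
    eps_map = {}
    for state, (x, y) in enumerate(trajectory):
        # epsilon-greedy action selection over the small adjustment set
        if rng.random() < 0.2:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q.get((state, a), 0.0))
        eps = max(0.05, base_eps + action)
        eps_map[state] = eps
        scale = 1.0 / eps

        def lap():  # inverse-transform Laplace sample with the chosen scale
            u = rng.random() - 0.5
            return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

        nx, ny = x + lap(), y + lap()
        utility = 1.0 / (1.0 + math.dist((nx, ny), (x, y)))
        security = min(1.0, min(math.dist((nx, ny), p) for p in pois))
        r = 0.5 * utility + 0.5 * security
        # one-step Q update toward the next location's best value
        best_next = max(q.get((state + 1, a), 0.0) for a in actions)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return eps_map
```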
Experimental results
The proposed ODPRL algorithm introduces three metrics, security risk, utility loss, and privacy gain, to evaluate the semantic security, data utility, and privacy gain on four datasets. During the evaluation, we compare the ODPRL algorithm with PSTPRL (Wang et al., 2019) and PDRL (Berri et al., 2020). Finally, we observe the effects of some parameters in ODPRL.
Conclusion
In this paper, we embed a reinforcement learning mechanism into differential privacy so that the privacy budget allocation is constantly updated toward an optimal policy for protecting both geographical and semantic location privacy. The proposed algorithm effectively combines the two metrics of seg-similarity and seman-security to define the reward function between different states, which guarantees the effect and efficiency of learning. In addition, the proposed algorithm uses well-designed metrics to evaluate
CRediT authorship contribution statement
Xin Chen: Conceptualization, Methodology, Writing – original draft. Tao Zhang: Writing – original draft, Validation. Sheng Shen: Software, Validation. Tianqing Zhu: Conceptualization, Methodology. Ping Xiong: Methodology, Writing – original draft.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Xin Chen is currently a lecturer in the Faculty of Mathematics and Computer Science, Wuhan Polytechnic University. He received his BEng degree from Central China Normal University, China, in 2001 and his MEng degree from Wuhan University, China, in 2008. His research focuses on differential privacy. Email: [email protected]
References (34)
- Multi-level location privacy protection based on differential privacy strategy in VANETs. 2019 IEEE 89th Vehicular Technology Conference (VTC2019-Spring), 2019.
- CPESP: cooperative pseudonym exchange and scheme permutation to preserve location privacy in VANETs. Veh. Commun., 2019.
- Protecting semantic trajectory privacy for VANET with reinforcement learning. ICC 2019 IEEE International Conference on Communications (ICC), 2019.
- Estimation of neighbors position privacy scheme with an adaptive beaconing approach for location privacy in VANETs. Computers & Electrical Engineering, 2018.
- Adjusted location privacy scheme for VANET safety applications. NOMS 2020 IEEE/IFIP Network Operations and Management Symposium, 2020.
- Alloyed pseudonym change strategy for location privacy in VANETs. 2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC), 2020.
- Privacy-preserving data-prefetching in vehicular networks via reinforcement learning. ICC 2020 IEEE International Conference on Communications (ICC), 2020.
- A framework for generating network-based moving objects. GeoInformatica, 2002.
- Strengthening privacy protection in VANETs. 2008 IEEE International Conference on Wireless and Mobile Computing, Networking and Communications, 2008.
- Differentially private location protection with continuous time stamps for VANETs. International Conference on Algorithms and Architectures for Parallel Processing, 2018.
- Optimal distribution of privacy budget in differential privacy. Risks and Security of Internet and Systems: 13th International Conference, CRiSIS 2018, Arcachon, France, October 16–18, 2018, Revised Selected Papers.
- Differential privacy. Automata, Languages and Programming.
- APDPk-means: a new differential privacy clustering algorithm based on arithmetic progression privacy budget allocation. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2019.
- Route choice analysis: data, models, algorithms and applications. EPFL.
- A comparative study of machine learning algorithms for VANET networks. The 3rd International Conference on Networking, Information Systems & Security.
- Protecting location privacy with personalized k-anonymity: architecture and algorithms. IEEE Trans. Mob. Comput., 2007.
Tao Zhang received the BEng and MEng degrees from the Information Engineering School, Nanchang University, China, in 2015 and 2018, respectively. He is currently working toward his Ph.D. degree with the School of Computer Science at the University of Technology Sydney, Australia. His research interests include privacy preservation, AI fairness, and machine learning. Email: [email protected]
Ping Xiong received his BEng degree from Lanzhou Jiaotong University, China, in 1997, and his MEng and Ph.D. degrees from Wuhan University, China, in 2002 and 2005, respectively. He is currently a professor in the School of Information and Security Engineering, Zhongnan University of Economics and Law, China. His research interests are network security, data mining, and privacy preservation. Email: [email protected]
Dr Tianqing Zhu is currently an Associate Professor in the School of Computer Science, University of Technology Sydney (UTS), Australia. She received her BEng and MEng degrees from Wuhan University, China, in 2000 and 2004, respectively, and a Ph.D. degree in Computer Science from Deakin University, Australia, in 2014. Before joining UTS, she was a lecturer in the School of Information Technology, Deakin University, Australia, from 2014 to 2018. Her research interests include privacy preservation and cyber security. Email: [email protected]
Sheng Shen is currently a Ph.D. student at the University of Technology Sydney. He received the Bachelor of Engineering (Honors) degree in Information and Communication Technology from the University of Technology Sydney in 2017, and the Master of Information Technology degree from the University of Sydney in 2018. His current research interests include data privacy preservation, differential privacy, and federated learning. Email: [email protected]