Privacy-preserving raw data collection without a trusted authority for IoT
Introduction
Research on the Internet of Things (IoT) is one of the most important areas in information technology, as IoT makes data collection more convenient. Sensors that can sense, process, and disseminate real-time information are now widely deployed to monitor and track various objects, such as animals, vehicles, and physical phenomena, and IoT applications have proliferated with the advancement of these wireless sensors. IoT utilizes and further extends the benefits of the Internet (e.g., data sharing and the always-on feature), which facilitates data collection and enables more accurate management. However, because the protocol standards are publicly known, IoT systems face ever-increasing attacks. To address these security threats, many security protocols using cryptographic techniques have been proposed to guarantee confidentiality, authentication, and integrity. For example, IPSec, Host Identity Protocol (HIP), and Extensible Authentication Protocol (EAP) provide point-to-point authentication for IoT nodes [1], [2]. In addition, secure group communication methods [3], [4], [5] are useful tools for ensuring that messages are transmitted efficiently among IoT devices.
Once a large amount of data has been collected and stored, it should be put to use as soon as possible. Moreover, data becomes more valuable as more of it is available, so data sharing is necessary. In data sharing scenarios, the above security protocols are inadequate for practical requirements. Specifically, traditional security protocols only guarantee that a transmitted message is shared among authorized users; public data sharing is not supported. For example, if secure channels have already been established using cryptographic techniques among the users u1, u2, ⋅⋅⋅, un in a group, then u1, u2, ⋅⋅⋅, un send the messages m1, m2, ⋅⋅⋅, mn independently. The security protocol guarantees that m1, m2, ⋅⋅⋅, mn are shared only among the group members, while outsiders learn nothing. If these data are shared directly with the public, the users' privacy will be violated. For example, the real-time electricity consumption reports from a smart meter can be used to infer a user's living habits and income, while the health data from a wearable device contains sensitive information about the wearer's physical condition. Hence, a user's data should be processed to protect privacy before it is uploaded to the big data center. Denote the IoT devices as u1, u2, ⋅⋅⋅, un and the corresponding collected data as m1, m2, ⋅⋅⋅, mn. The data m1, m2, ⋅⋅⋅, mn are processed, collected, and published for public utility and data analysis, and the published data should satisfy two requirements: 1) the statistical features of the processed data approximate those of the raw data; and 2) the published data cannot be linked to the corresponding user or IoT device. These privacy concerns therefore differ from the traditional security goals. In this paper, we assume that secure channels among the entities in the IoT system have already been established using secure group communication or similar methods, and we focus instead on how to process and collect the data m1, m2, ⋅⋅⋅, mn.
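The two requirements can be illustrated with a minimal Python sketch. It is only a toy model of the goal, not the protocol proposed in this paper: the function name `publish_unlinked`, the device labels, and the readings are hypothetical, and the shuffle is performed by a single trusted function, which is exactly the kind of central dependency a TA-free scheme has to avoid.

```python
import random
import statistics

def publish_unlinked(readings, seed=None):
    """Return the raw values in random order, without device identifiers."""
    values = list(readings.values())      # keep each m_i, drop each u_i
    random.Random(seed).shuffle(values)   # break the positional link to the sender
    return values

# Hypothetical readings (e.g., systolic blood pressure) keyed by device.
readings = {"u1": 118, "u2": 131, "u3": 125, "u4": 142}
published = publish_unlinked(readings, seed=7)

# Requirement 1: statistics of the published data match those of the raw data.
same_mean = statistics.mean(published) == statistics.mean(list(readings.values()))

# Requirement 2: the published list carries no device identifier at all.
no_identifiers = all(not isinstance(v, str) for v in published)
```

Because the values themselves are untouched, every statistic computed on `published` equals the one computed on the raw data; only the association between a value and its sender is lost.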
Privacy-preserving mechanisms such as generalization [6], aggregation [7], [8], and differential privacy [9] have been proposed. However, in order to protect participants' privacy, these mechanisms deliberately sacrifice the integrity of the sensed data. For example, data aggregation is the main method for masking individual data values: it computes the sum, the median, or the max/min value in an area [10], [11], [12], [13], which is very useful in many situations, such as smart meter reading and environmental monitoring. In other cases, such as collecting blood pressure or other health data, data aggregation is less useful. Moreover, most privacy-preserving aggregation schemes need to employ a trusted authority (TA). Recently, differential privacy has been used to protect individual information in data collection: some randomness, such as Laplace noise, is added to the collected data, and an approximate aggregate is later recovered from the noisy data. Although aggregation using differential privacy does not rely on a TA, its result is not as accurate as that of the other methods.
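The trade-off described for noise-based collection can be sketched in a few lines of Python. This is an illustrative toy, not a calibrated differential privacy mechanism: the noise scale of 1.0, the seed, and the synthetic readings are assumptions chosen for the example, and the helper names are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb(readings, scale, rng):
    """Each device adds its own Laplace noise before reporting."""
    return [m + laplace_noise(scale, rng) for m in readings]

rng = random.Random(2024)
raw = [rng.uniform(0.0, 5.0) for _ in range(10_000)]  # hypothetical meter readings
noisy = perturb(raw, scale=1.0, rng=rng)

true_mean = sum(raw) / len(raw)
noisy_mean = sum(noisy) / len(noisy)

# The aggregate stays close to the truth, but individual values do not.
worst_individual_error = max(abs(a - b) for a, b in zip(raw, noisy))
```

The mean of the noisy reports is only an approximation of `true_mean`, and each individual report may be far from the value that was sensed, which is why such data are unsuitable when the raw values themselves are needed.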
Another method for privacy-preserving data collection is k-anonymity, introduced by Sweeney and Samarati in 1998, in which the k-anonymity property requires that each person cannot be distinguished from at least k - 1 other individuals [14], [15]. For example, if the targeted attributes are age and gender, then every combination of age and gender that appears in a k-anonymized data set must be shared by at least k records, and transformations such as generalization, global recoding, and suppression need to be applied to the original attributes to achieve this. However, k-anonymity has some disadvantages: 1) some attributes of the data need to be deleted or modified; and 2) individual data can be re-identified when several related data sets are linked together.
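The k-anonymity property for the age and gender example can be checked with a short Python sketch. The records, the ten-year generalization width, and the helper names `is_k_anonymous` and `generalize_age` are hypothetical and serve only to illustrate the definition.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

def generalize_age(record, width=10):
    """Replace the exact age with a range such as '20-29' (generalization)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}

# Hypothetical health records; `bp` is the sensitive value.
records = [
    {"age": 23, "gender": "F", "bp": 118},
    {"age": 27, "gender": "F", "bp": 131},
    {"age": 34, "gender": "M", "bp": 125},
    {"age": 38, "gender": "M", "bp": 142},
]
generalized = [generalize_age(r) for r in records]

raw_ok = is_k_anonymous(records, ["age", "gender"], k=2)          # False
generalized_ok = is_k_anonymous(generalized, ["age", "gender"], k=2)  # True
```

The generalized table satisfies 2-anonymity only because the exact ages were modified, which illustrates the first disadvantage listed above.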
Another efficient and novel idea for data collection is n-source anonymity [7], which uses cryptographic tools to unlink each data item from its sender within a set of n members, and which therefore differs from the k-anonymity method. The design in this paper is similar to the n-source anonymity scheme and simultaneously realizes two goals: 1) the raw data is collected, and 2) unlinkability between the data and its sender is achieved. Assuming that a data consumer recruits a number of IoT devices to collect time series sensing data [11], [16], she needs the collaboration of a data collector, which is an entity in the IoT system.
Inspired by previous works [7], [17], [18], [19], an efficient anonymous data collection protocol is proposed for practical environments. To encourage the end devices of IoT systems to participate in the sensing task and to maintain the equipment status, a reward mechanism is also added. For more information about anonymous data collection, readers are referred to [20]. The main contributions of this paper include:
- (1) An anonymous data collection scheme for IoT is proposed that does not employ a TA, which makes it more applicable, since it is difficult to deploy a high-cost TA node in practical IoT scenarios.
- (2) No noise is added to the collected data, nor are the collected data aggregated. The raw data is shared or released, which makes it more valuable for big data analysis.
The rest of this paper is organized as follows: Section 2 discusses the related works. The system architecture and system model are introduced in Section 3. Section 4 describes our proposed scheme. The approach to handling dynamic changes of participants is discussed in Section 5. Section 6 analyzes the privacy properties and compares them with some related works. The simulated performance is presented in Section 7. Finally, we conclude this paper in Section 8.
Related work
Cryptography is a very useful tool for ensuring privacy in data collection in the IoT environment. Yang et al. [21] proposed a data collection scheme using the ElGamal cryptosystem, in which there are t leaders, and all participants encrypt their data with the t leaders’ public keys. However, Brickell and Shmatikov [22] showed that Yang et al.’s scheme [21] is vulnerable to the collusion attack. In the improved version of [22], each participant has two public/private
Building blocks
This section discusses the encryption tools in the protocol.
The fog-assisted raw data collection
In this section, a fog-assisted raw data collection protocol is proposed, which comprises three main phases: the setup phase, the slot generation phase, and the data collection and reward phase. The blind signature scheme used in the second phase can be found in [29]. Some notations are listed in Table 1, and an overview of the proposed scheme is shown in Fig. 2.
Dynamic change of participants
When the group is not large, all participants can renegotiate the slots whenever participants join or leave. However, this may incur expensive communication and computation costs when the group is large. This section describes how to handle dynamic changes of participants efficiently.
Analysis
In this section, the analysis and the performance simulation are presented, which demonstrate that the proposed scheme not only satisfies the claimed requirements but is also suitable for the practical environment.
Communication overhead
According to [32], communication is more energy-hungry than computation, and the main communication of this protocol includes: pi ↔ pj, pi ↔ FN, and FN ↔ CS.
Conclusion
In this paper, a practical IoT sensing data collection protocol is proposed without a TA, which not only preserves the raw data but also unlinks the data from its contributor. With the increasing demand for data sharing, our protocol guarantees the utility of the data as a whole and the privacy of individual data at the same time. This not only increases the value of the data but also eliminates the concern of privacy leakage. In addition, the simulation with 1000 IoT devices was executed to
Acknowledgments
The authors would like to thank anonymous reviewers for their valuable comments and thank Prof. Frank Piessens for helping us to improve the quality of the revision. In addition, we thank Zhiwen Bai for his contribution to the performance simulation.
This work was partly supported by National Natural Science Foundation of China under grants Nos. 61662016 and 61772224, GUET Excellent Graduate Thesis Program No. 16YJPYSS17, Key projects of Guangxi Natural Science Foundation under grant no.
References (32)
- et al. On perspective of security and privacy-preserving solutions in the internet of things. Comput. Netw. (2016)
- et al. A lightweight authenticated communication scheme for smart grid. IEEE Sens. J. (2016)
- et al. Lightweight and efficient privacy-preserving data aggregation approach for the smart grid. Ad Hoc Netw. (2017)
- et al. Efficient and privacy-preserving data aggregation in mobile sensing.
- et al. An anonymous data reporting strategy with ensuring incentives for mobile crowd-sensing. J. Ambient Intell. Hum. Comput. (2017)
- et al. Anonymous sensory data collection approach for mobile participatory sensing.
- et al. Lightweight secure group communications for resource constrained devices. Int. J. Space-Based Situated Comput. (2015)
- et al. An improved authenticated group key transfer protocol based on secret sharing. IEEE Trans. Comput. (2013)
- et al. Certificateless public integrity checking of group shared data on cloud storage. IEEE Trans. Serv. Comput. (2018)
- et al. A privacy-aware framework for participatory sensing. ACM SIGKDD Explor. Newslett. (2011)
- Privacy-preserving data aggregation in mobile phone sensing. IEEE Trans. Inf. Forens. Secur.
- A Practical privacy-preserving data aggregation (3PDA) scheme for smart grid. IEEE Trans. Industr. Inf.
- Data mining with differential privacy.
- KIPDA: k-indistinguishable privacy-preserving data aggregation in wireless sensor networks.
- Efficient and privacy-preserving min and kth min computations in mobile sensing systems. IEEE Trans. Depend. Secure Comput.
- Medians and beyond: new aggregation techniques for sensor networks.
Yining Liu is currently a professor in the School of Computer and Information Security, Guilin University of Electronic Technology, China. He received the B.S. degree in Applied Mathematics from Information Engineering University, Zhengzhou, China, in 1995, the M.E. in Computer Software and Theory from Huazhong University of Science and Technology, Wuhan, China, in 2003, and the Ph.D. degree in Mathematics from Hubei University, Wuhan, China, in 2007. His research interests include applied cryptography and data privacy protocols.
Yanping Wang is now pursuing her M.E. degree in Computer Science at Guilin University of Electronic Technology, Guilin, China. She received her B.E. degree in Information Security from Guilin University of Electronic Technology, Guilin, China, in 2016. Her research focuses on data privacy protocols.
Xiaofen Wang received her Ph.D. and M.S. degrees from Xidian University, Xi'an, China, in 2009 and 2006, respectively. Dr. Wang is currently an associate professor at School of Computer Science and Engineering, and The Center for Cyber Security, University of Electronic Science and Technology of China, Chengdu, China. She is also currently a visiting research fellow in Centre for Computer and Information Security, University of Wollongong. Her research interests are public key cryptography and its applications in wireless networks, smart grid, and cloud computing.
Zhe Xia is an Associate Professor in Department of Computer Science at Wuhan University of Technology (WHUT), China. He received the PhD degree from University of Surrey (UK) in 2009, supervised by Prof. Steve Schneider. He has worked as a Research Fellow in Department of Computing at University of Surrey between 2009 and 2013 before joining WHUT. His research interests include secure e-voting protocols, secret sharing and secure multiparty computation. Dr. Xia serves as the Associate Editor for Journal of Information Security Applications (JISA). He also serves on the program committees for many international conferences, such as NSS, EVT, VOTE-ID, DCIT.
Jingfang Xu (Chingfang Hsu) received the M.Eng. and the Ph.D. degrees in information security from the Huazhong University of Science and Technology, Wuhan, China, in 2006 and 2010 respectively. From Sep. 2010 to Mar. 2013, she was a Research Fellow at Huazhong University of Science and Technology. She is currently an Assistant Professor at Central China Normal University, Wuhan, China. Her research interests are in cryptography and network security, especially in secret sharing and its applications.