Elsevier

Computer Networks

Volume 148, 15 January 2019, Pages 340-348

Privacy-preserving raw data collection without a trusted authority for IoT

https://doi.org/10.1016/j.comnet.2018.11.028

Abstract

With the rapid development of IoT technologies, a large amount of real-time data is collected and shared, with important impacts on many applications, such as business advertisement and decision-making assistance. However, most users are unwilling to share their personal data directly with any third party for either academic research or commercial analysis because personal data contains private or sensitive information, such as economic status or living habits. Balancing the utility of big data and users’ privacy is a vital issue in academia and industry. In this paper, a privacy-preserving raw data collection scheme for IoT is proposed, in which each participant's data is collected and obfuscated with the other participants’ data within a group in order to protect the individual's privacy. Specifically, individual data is kept in its raw format to enhance its value for the data consumer, while no one other than the user herself knows the source of the collected data. In addition, no trusted authority (TA) is needed in our proposed scheme, which makes it more suitable for real-world applications. Moreover, efficiency analysis is performed by simulation, and the result shows that our proposed scheme is practical for IoT systems.

Introduction

Research on the Internet of Things (IoT) is one of the most important areas in information technology, as IoT makes data collection more convenient. Currently, sensors that can sense, process, and disseminate real-time information are widely deployed to monitor and track various objects, such as animals, vehicles, and physical phenomena. IoT applications have proliferated due to the advancement of these wireless sensors. IoT utilizes and further extends the benefits of the Internet (e.g., data sharing and the always-on feature), which facilitates data collection and enables more accurate management. However, because the protocol standards are publicly known, IoT systems face ever-increasing attacks. To address these security threats, many security protocols using cryptographic techniques have been proposed to guarantee confidentiality, authentication, and integrity. For example, IPSec, Host Identity Protocol (HIP), and Extensible Authentication Protocol (EAP) provide point-to-point authentication for IoT nodes [1], [2]. In addition, secure group communication methods [3], [4], [5] are useful tools to ensure that messages are efficiently transmitted among IoT devices.

After a large amount of data is collected and stored, it should be utilized as soon as possible. Furthermore, more data brings greater value, so data sharing is necessary. In data sharing scenarios, the above security protocols are inadequate to address the practical requirements. Specifically, the traditional security protocols only guarantee that the transmitted message is shared among the authorized users, and public data sharing is not allowed. For example, if secure channels are already established using cryptographic techniques among the users u1, u2, ⋅⋅⋅, un in the group, then u1, u2, ⋅⋅⋅, un send the messages m1, m2, ⋅⋅⋅, mn independently. The security protocol guarantees that m1, m2, ⋅⋅⋅, mn are only shared among the group members, while others can learn nothing. If these data are directly shared with the public, the users’ privacy will be violated. For example, the real-time electricity consumption report from a smart meter can be used to infer users’ living habits and economic income, while the health data from a wearable device contains sensitive physical status information. Hence, the user's data should be processed in order to protect its privacy before being uploaded to the big data center. Denote the IoT devices as u1, u2, ⋅⋅⋅, un and the corresponding collected data as m1, m2, ⋅⋅⋅, mn. The data m1, m2, ⋅⋅⋅, mn are processed, collected, and published for public utility and data analysis, which should satisfy two requirements: 1) the statistical features of the processed data approximate those of the raw data; and 2) the published data cannot be linked to the corresponding user or IoT device. Therefore, the privacy concerns are different from the traditional security goals. In this paper, we assume that the secure channel among the entities in IoT systems is already established by using secure group communication or similar methods, and we instead focus on how to process and collect the data m1, m2, ⋅⋅⋅, mn.

Privacy-preserving mechanisms such as generalization [6], aggregation [7], [8], and differential privacy [9] have been proposed. However, in order to protect participants’ privacy, these mechanisms deliberately destroy the integrity of the sensed data. For example, data aggregation is the main method to mask single data values by computing the sum, the median, or the max/min value in an area [10], [11], [12], [13], which is very useful in many situations, such as smart meter reading and environmental monitoring. However, in some other cases, such as collecting blood pressure or other health data, data aggregation is not as useful. Moreover, most of the privacy-preserving aggregation schemes need to employ a trusted authority (TA). Recently, differential privacy has been used to protect individual information in data collection, in which some randomness, such as Laplace noise, is added to the collected data. Later, the approximate aggregate data is obtained by removing the noise. Although aggregation using differential privacy does not rely on a TA, its result is not as accurate as that of the other methods.
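The trade-off described above can be illustrated with a minimal Python sketch, which is not taken from the paper. It assumes hypothetical readings normalized to [0, 1] and illustrative values for the sensitivity and the privacy parameter epsilon; each reading receives independent Laplace noise with scale sensitivity/epsilon. The individual noisy values no longer resemble the raw ones, while the mean over many devices stays close to the true mean.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb(readings, sensitivity, epsilon, rng):
    """Add independent Laplace noise with scale sensitivity/epsilon to each reading."""
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale, rng) for x in readings]

rng = random.Random(2019)  # fixed seed so the sketch is repeatable
# Hypothetical normalized sensor readings in [0, 1] (assumption, not paper data).
readings = [rng.random() for _ in range(10_000)]
# sensitivity and epsilon are illustrative choices only.
noisy = perturb(readings, sensitivity=1.0, epsilon=0.5, rng=rng)

true_mean = sum(readings) / len(readings)
noisy_mean = sum(noisy) / len(noisy)
print(f"true mean  = {true_mean:.3f}")
print(f"noisy mean = {noisy_mean:.3f}")
print(f"first raw reading = {readings[0]:.3f}, first noisy reading = {noisy[0]:.3f}")
```

With noise of this scale, a single published value says little about the raw reading, which is exactly why such data is less useful when individual raw values are needed.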

Another method to achieve privacy-preserving data collection is k-anonymity, introduced by Sweeney and Samarati in 1998, in which the k-anonymity property requires that each person cannot be distinguished from at least k - 1 other individuals [14], [15]. For example, if the targeted attributes are age and gender, then every combination of age and gender that appears in a k-anonymized data set must be shared by at least k records, and some transformations, such as generalization, global recoding, and suppression, need to be performed on the original attributes to achieve this. However, k-anonymity has some disadvantages: 1) some attributes of the data need to be deleted or modified; and 2) the individual data can be re-identified when several related data sets are associated together.
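The age and gender example can be made concrete with a short Python sketch. The records, the ten-year age bands, and the function names are assumptions introduced for illustration; the check simply counts how many records share each combination of the targeted attributes.

```python
from collections import Counter

def generalize(record, band=10):
    """Replace the exact age by an age band, keeping gender unchanged."""
    age, gender = record
    low = (age // band) * band
    return (f"{low}-{low + band - 1}", gender)

def is_k_anonymous(records, k):
    """True if every attribute combination that appears is shared by at least k records."""
    counts = Counter(records)
    return all(count >= k for count in counts.values())

# Hypothetical (age, gender) records.
raw = [(23, "F"), (27, "F"), (31, "M"), (38, "M"), (25, "F"), (34, "M")]
generalized = [generalize(r) for r in raw]

print(is_k_anonymous(raw, 2))          # exact ages are all distinct
print(is_k_anonymous(generalized, 3))  # each age band and gender pair covers three records
```

The sketch also shows the first disadvantage listed above: the exact ages are lost once they are generalized into bands.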

Another efficient and novel idea for data collection is n-source anonymity [7], which uses cryptographic tools to unlink the data from its sender in a set with n members. It differs from the k-anonymity method. In this paper, our design is similar to the n-source anonymity scheme and simultaneously realizes two goals: 1) the raw data is collected, and 2) unlinkability between the data and its sender is achieved. Assuming that a data consumer recruits a number of IoT devices to collect time series sensing data [11], [16], she needs the collaboration of a data collector, which is an entity in IoT systems.
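The two goals can be pictured with a toy Python sketch. It is not the proposed protocol and uses no cryptography: the slot assignment is drawn in one place purely for illustration, whereas a real scheme would have to ensure that no single party learns it. The function name and the sample messages are hypothetical.

```python
import random

def collect_by_slots(messages, rng):
    """Place each participant's raw message into a distinct, randomly assigned slot.

    Only the slot-ordered list is returned, which models what the collector sees.
    Assumption: the assignment is discarded and never revealed to the collector.
    """
    slots = list(range(len(messages)))
    rng.shuffle(slots)  # slots[i] is the slot used by participant i
    published = [None] * len(messages)
    for participant, slot in enumerate(slots):
        published[slot] = messages[participant]
    return published

messages = ["m1", "m2", "m3", "m4", "m5"]  # raw data of u1..u5 (hypothetical)
published = collect_by_slots(messages, random.Random())

# Goal 1: the raw values survive unchanged (same multiset of messages).
print(sorted(published) == sorted(messages))
# Goal 2: the position in the published list is not the sender's index.
print(published)
```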

Inspired by the previous works [7], [17], [18], [19], an efficient anonymous data collection protocol is proposed for practical environments. To encourage the end device of IoT systems to participate in the sensing task and to maintain the equipment status, a reward mechanism is also added. For more information about anonymous data collection, readers are referred to the paper in [20]. The main contributions of this paper include:

  • (1) An anonymous data collection scheme for IoT is proposed without employing a TA, which makes it more applicable since it is difficult to deploy a high-cost TA node in practical IoT scenarios.

  • (2) No noise is added to the collected data, nor is the collected data aggregated. The raw data is shared or released, which makes it more valuable for big data analysis.

The rest of this paper is organized as follows: Section 2 discusses the related work. The system architecture and system model are introduced in Section 3. Section 4 describes our proposed scheme. The approach to handling the dynamic change of participants is discussed in Section 5. Section 6 presents the privacy characteristics and compares them with some related works. In addition, the performance obtained by simulation is presented in Section 7. Finally, we conclude this paper in Section 8.

Related work

Cryptography is a very useful tool to ensure privacy for data collection in the IoT environment. Yang et al. [21] proposed a data collection scheme using the ElGamal cryptosystem, in which there are t leaders, and all participants encrypt their data with the t leaders’ public keys. However, Brickell and Shmatikov [22] showed that Yang et al.’s scheme [21] is vulnerable to the collusion attack. In the improved version of [22], each participant has two public/private

Building blocks

This section discusses the encryption tools in the protocol.

The fog-assisted raw data collection

In this section, a fog-assisted raw data collection protocol is proposed, which mainly comprises three phases: the setup phase, the slot generation phase, and the data collection and reward phase. The blind signature scheme used in the second phase can be found in [29]. Some notations are listed in Table 1, and an overview of the proposed scheme is shown in Fig. 2.
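The text above does not say which blind signature scheme [29] describes. Purely to illustrate the primitive, the following Python sketch uses textbook RSA blinding, which is an assumption and not necessarily the scheme used in the protocol. The key is a toy key that is far too small for real use, and all names and the sample message are hypothetical.

```python
import hashlib
import math
import secrets

# Toy RSA key of the signer (assumption: illustrative only, insecure size).
P, Q = 61, 53
N = P * Q
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)

def digest(message: bytes) -> int:
    """Hash the message and reduce it into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def blind(m: int):
    """Requester hides m with a random factor r that is invertible modulo N."""
    while True:
        r = secrets.randbelow(N - 2) + 2  # r in [2, N - 1]
        if math.gcd(r, N) == 1:
            return (m * pow(r, E, N)) % N, r

def sign_blinded(blinded: int) -> int:
    """Signer signs the blinded value without seeing m."""
    return pow(blinded, D, N)

def unblind(blind_sig: int, r: int) -> int:
    """Requester removes r to obtain an ordinary signature on m."""
    return (blind_sig * pow(r, -1, N)) % N

def verify(m: int, sig: int) -> bool:
    """Anyone can check the signature with the public key (N, E)."""
    return pow(sig, E, N) == m

m = digest(b"hypothetical request from a participant")
blinded, r = blind(m)
sig = unblind(sign_blinded(blinded), r)
print(verify(m, sig))
```

The signer only ever sees the blinded value, so a signature presented later cannot be matched to the signing request by comparing values, which is the property that makes blind signatures attractive for anonymous collection and reward.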

Dynamic change of participants

When the group is not large, it is feasible for all participants to negotiate the slots again whenever participants join or leave. However, renegotiation may incur expensive communication and computation costs when the group is large. This section describes how to handle the dynamic change of participants efficiently.

Analysis

In this section, the analysis and the performance simulation are presented, which demonstrate that the proposed scheme not only satisfies the claimed requirements but is also suitable for the practical environment.

Communication overhead

According to [32], communication is more energy-hungry than computation, and the main communication of this protocol includes: pi ↔ pj, pi ↔ FN, and FN ↔ CS.

Conclusion

In this paper, a practical IoT sensing collection protocol is proposed without a TA, which not only preserves the raw data but also unlinks the data from its contributor. With the increasing demand for data sharing, our protocol guarantees the utility of the data as a whole and the privacy of individual data at the same time. This not only increases the value of the data but also eliminates the concern of privacy leakage. In addition, the simulation with 1000 IoT devices was executed to

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments and Prof. Frank Piessens for helping us to improve the quality of the revision. In addition, we thank Zhiwen Bai for his contribution to the performance simulation.

This work was partly supported by National Natural Science Foundation of China under grants Nos. 61662016 and 61772224, GUET Excellent Graduate Thesis Program No. 16YJPYSS17, Key projects of Guangxi Natural Science Foundation under grant no.

References (32)

  • Y. Zhang et al., Privacy-preserving data aggregation in mobile phone sensing, IEEE Trans. Inf. Forens. Secur. (2016)
  • Y.N. Liu et al., A Practical privacy-preserving data aggregation (3PDA) scheme for smart grid, IEEE Trans. Industr. Inf. (2018)
  • A. Friedman et al., Data mining with differential privacy
  • M.M. Groat et al., KIPDA: k-indistinguishable privacy-preserving data aggregation in wireless sensor networks
  • Y. Zhang et al., Efficient and privacy-preserving min and kth min computations in mobile sensing systems, IEEE Trans. Depend. Secure Comput. (2017)
  • N. Shrivastava et al., Medians and beyond: new aggregation techniques for sensor networks

Yining Liu is currently a professor in the School of Computer and Information Security, Guilin University of Electronic Technology, China. He received the B.S. degree in Applied Mathematics from Information Engineering University, Zhengzhou, China, in 1995, the M.E. in Computer Software and Theory from Huazhong University of Science and Technology, Wuhan, China, in 2003, and the Ph.D. degree in Mathematics from Hubei University, Wuhan, China, in 2007. His research interests include applied cryptography and data privacy protocols.

Yanping Wang is now pursuing her M.E. degree in Computer Science at Guilin University of Electronic Technology, Guilin, China. She received her B.E. degree in Information Security from Guilin University of Electronic Technology, Guilin, China, in 2016. Her research focuses on data privacy protocols.

Xiaofen Wang received her Ph.D. and M.S. degrees from Xidian University, Xi'an, China, in 2009 and 2006, respectively. Dr. Wang is currently an associate professor at the School of Computer Science and Engineering and the Center for Cyber Security, University of Electronic Science and Technology of China, Chengdu, China. She is also currently a visiting research fellow in the Centre for Computer and Information Security, University of Wollongong. Her research interests are public key cryptography and its applications in wireless networks, smart grid, and cloud computing.

Zhe Xia is an Associate Professor in the Department of Computer Science at Wuhan University of Technology (WHUT), China. He received the PhD degree from the University of Surrey (UK) in 2009, supervised by Prof. Steve Schneider. He worked as a Research Fellow in the Department of Computing at the University of Surrey between 2009 and 2013 before joining WHUT. His research interests include secure e-voting protocols, secret sharing, and secure multiparty computation. Dr. Xia serves as the Associate Editor for Journal of Information Security Applications (JISA). He also serves on the program committees for many international conferences, such as NSS, EVT, VOTE-ID, and DCIT.

Jingfang Xu (Chingfang Hsu) received the M.Eng. and the Ph.D. degrees in information security from the Huazhong University of Science and Technology, Wuhan, China, in 2006 and 2010, respectively. From Sep. 2010 to Mar. 2013, she was a Research Fellow at Huazhong University of Science and Technology. She is currently an Assistant Professor at Central China Normal University, Wuhan, China. Her research interests are in cryptography and network security, especially in secret sharing and its applications.
