Privacy-preserving statistical analysis over multi-dimensional aggregated data in edge computing-based smart grid systems

https://doi.org/10.1016/j.sysarc.2022.102508

Abstract

Smart grid systems enable bidirectional data communication between users and a smart grid control center (CC) by utilizing various communication infrastructures and embedded devices. To extract valuable information from users’ power consumption data efficiently, the users’ multi-dimensional data must be analyzed in depth. To protect users’ privacy, power consumption data are usually encrypted before transmission, which in turn makes statistical analysis difficult. In this paper, we propose a scheme that enables privacy-preserving statistical analysis over multi-dimensional aggregated data (SA-MAD) in smart grid systems equipped with edge computing. We modify the Boneh–Goh–Nissim (BGN) public key cryptosystem into a dual-message encryption mode and combine it with two special superincreasing sequences to handle multi-dimensional encrypted data aggregation. In addition, we design an identity-based aggregate signature to ensure the integrity of encrypted data in smart grid systems, and employ the Shamir secret sharing technique to support a transmission fault-tolerance mechanism from smart meters to their corresponding edge servers. SA-MAD enables the CC to flexibly conduct privacy-preserving statistical analysis (e.g., sum, average, and variance) over aggregated data, and it can be easily extended to support covariance and linear regression computation. The performance evaluation demonstrates the feasibility of SA-MAD in edge computing-based smart grid systems.

Introduction

Traditional power grids can no longer satisfy the growing demand for continuous and stable electricity generation and distribution. Major incidents illustrate this trend, for instance, the electrical blackouts in North America in 2003 and in Europe in 2006, which were caused by load imbalance and the lack of effective real-time diagnosis in power grids [1], [2]. By contrast, smart grids integrate various advanced communication and information technologies, e.g., cloud computing, artificial intelligence, and other emerging techniques, to enable cost-effective bidirectional communication between smart meters and a grid control center, with the aim of overcoming the shortcomings of traditional power grids [3], [4].

In smart grids, a smart meter (SM) is a critical embedded device that plays an important role in realizing bidirectional communication. Smart meters are typically used to periodically collect users’ power consumption data and send them to a smart grid control center. Because the collected data are closely related to users’ living habits and household security, an adversary in the smart grid system may intercept the transmitted data and try to extract useful information using various attack methods, e.g., machine learning or non-intrusive load monitoring (NILM), thereby violating users’ privacy [5], [6], [7], [8], [9], [10], [11]. It is therefore indispensable to preprocess users’ data to preserve privacy against these attacks. The integrity of power consumption data is also required in smart grids: malicious users may tamper with the data to avoid subsequent billing, and an external adversary may tamper with or replace the data to cause system chaos or even collapse [12], [13], [14]. Moreover, since smart meters are installed close to users’ houses with few protection measures, they often malfunction as a result of system crashes or deliberate damage [15], [16]. Consequently, a practical smart grid system should be as robust as possible and provide transmission fault tolerance for smart meters.

Data aggregation is an effective approach to preserving the privacy of users’ power consumption data while greatly reducing communication overhead in distributed environments. In smart grids, smart meters periodically collect multi-dimensional power consumption data for processing and analysis, and an intermediate device can serve as a data aggregator or communication relay between the smart meters and the smart grid control center. To enable the control center to conduct privacy-preserving statistical analysis over the aggregated power consumption data, homomorphic encryption is frequently used to encrypt users’ data prior to transmission. Specific types of computations can then be carried out on ciphertexts, and the decrypted result matches the result of the same operations performed on the plaintexts. The Paillier homomorphic cryptosystem [17] and the BGN homomorphic cryptosystem [18] are two typical techniques used in data aggregation. With the growing requirements of big data analysis in smart grids, new techniques are needed to construct data aggregation schemes that support a rich set of privacy-preserving statistical analyses, e.g., sum, average, variance, covariance, and linear regression.
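To make the idea of computing on ciphertexts concrete, the following is a minimal sketch of additive homomorphic aggregation using a toy Paillier instance. It is not the SA-MAD construction (which is based on a modified BGN cryptosystem); the function names, the tiny primes 1009 and 1013, and the sample readings are assumptions chosen purely for illustration, and the parameters are far too small to be secure.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key generation from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                 # a common simple choice of generator
    mu = pow(lam, -1, n)      # modular inverse (Python 3.8+); valid because g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover the plaintext using L(u) = (u - 1) / n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def aggregate(pub, ciphertexts):
    """Multiplying ciphertexts modulo n^2 adds the underlying plaintexts."""
    n, _ = pub
    total = 1
    for c in ciphertexts:
        total = (total * c) % (n * n)
    return total

pub, priv = keygen(1009, 1013)      # hypothetical toy primes
readings = [12, 7, 30]              # hypothetical meter readings
ciphertexts = [encrypt(pub, m) for m in readings]
recovered_sum = decrypt(pub, priv, aggregate(pub, ciphertexts))
print(recovered_sum)                # prints 49, the sum of the readings
```

The aggregator only multiplies ciphertexts and never sees an individual reading; only the holder of the private key learns the sum. A purely additive scheme of this kind yields sums, which is why richer statistics such as variance need additional machinery.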

Beyond the advantages of privacy-preserving data aggregation for statistical analysis of power consumption data, we observe that the emerging edge computing technique can be introduced into the deployment of smart grids. Edge computing is a computing paradigm that has gained considerable popularity in academia and industry [19], [20], [21], [22]. It brings the services and utilities of cloud computing closer to the end user and is characterized by fast processing and quick application response times. As such, it is especially suitable for delay-sensitive Internet-of-Things (IoT) applications (e.g., smart grids) that require low latency, mobility, and location-awareness support. Integrating an edge computing framework into smart grid systems could improve their ability to process massive heterogeneous data and the real-time performance of transaction processing [23], [24], [25].

In this paper, we present a data aggregation scheme for smart grid systems based on the edge computing paradigm, which enables a grid control center to conduct privacy-preserving statistical analysis over multi-dimensional aggregated data (SA-MAD). Specifically, the contributions of this work are threefold:

  • We extend the original BGN homomorphic encryption to a BGN dual-message encryption mode, in which the order of the underlying bilinear map groups is constructed from four distinct primes. We design two superincreasing sequences and combine them with the modified four-prime BGN encryption algorithm so that smart meters can encrypt multi-dimensional power consumption data and the corresponding square values into one ciphertext. Once the ciphertexts from different users are aggregated into a single ciphertext, the control center can recover the sum of the users’ multi-dimensional data and the sum of the squares of those data with the corresponding private keys. This approach can also be extended, with slight modification, to compute the covariance and linear regression of two data types.

  • To improve system robustness and support a transmission fault-tolerance mechanism, we exploit the Shamir secret sharing technique to split a preset parameter into slices, one for each smart meter in a single grid area. Each smart meter uses its slice to further obfuscate the ciphertext, so that even if the decryption private keys of the control center are leaked to an adversary, the confidentiality of each user’s power consumption data is still preserved. As long as the number of smart meters in the grid area is greater than or equal to the threshold value, the obfuscation factor in the ciphertext can be canceled by the corresponding edge server during data aggregation, so the subsequent data aggregation and statistical analysis functionalities are still achieved. In addition, we design a new identity-based aggregate signature algorithm to ensure the integrity of multi-dimensional power consumption data throughout smart grid systems equipped with edge computing.

  • We rigorously demonstrate the correctness of SA-MAD, including the average, variance, covariance, and linear regression computations enabled by the modified four-prime encryption and the superincreasing sequences. We prove that the modified four-prime BGN encryption algorithm achieves ciphertext indistinguishability under the composite order subgroup decision assumption, and that SA-MAD achieves fault tolerance and preserves the integrity of encrypted data. Finally, we conduct a comprehensive performance evaluation to demonstrate its feasibility and advantages in the deployment of edge computing-based smart grids.
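The role of the two superincreasing sequences in the first contribution can be sketched as follows. This is a plaintext-only illustration under stated assumptions: ordinary integer addition stands in for homomorphic aggregation of ciphertexts, and the helper names, the value bound of 15, and the sample readings are hypothetical rather than taken from the paper. One sequence packs each user's multi-dimensional values into a single integer, the other packs the squared values, and the per-dimension sums recovered from the aggregate give the mean and variance.

```python
from fractions import Fraction

def superincreasing(num_dims, num_users, bound):
    """Build a_1 = 1 and a_j > num_users * bound * (a_1 + ... + a_{j-1})."""
    seq = [1]
    for _ in range(num_dims - 1):
        seq.append(num_users * bound * sum(seq) + 1)
    return seq

def pack(values, seq):
    """Encode one user's multi-dimensional values as a single integer."""
    return sum(a * v for a, v in zip(seq, values))

def unpack(total, seq):
    """Recover the per-dimension sums from an aggregated packed integer."""
    sums = [0] * len(seq)
    for j in range(len(seq) - 1, -1, -1):   # peel off the largest weight first
        sums[j], total = divmod(total, seq[j])
    return sums

# Hypothetical readings: 3 users, 3 dimensions, every value at most 15.
readings = [[3, 10, 2], [5, 8, 4], [4, 12, 0]]
num_users, num_dims, bound = len(readings), 3, 15

seq_values = superincreasing(num_dims, num_users, bound)
seq_squares = superincreasing(num_dims, num_users, bound * bound)

# Integer addition stands in for homomorphic aggregation of ciphertexts.
agg_values = sum(pack(r, seq_values) for r in readings)
agg_squares = sum(pack([v * v for v in r], seq_squares) for r in readings)

sums = unpack(agg_values, seq_values)
square_sums = unpack(agg_squares, seq_squares)
means = [Fraction(s, num_users) for s in sums]
variances = [Fraction(q, num_users) - m * m for q, m in zip(square_sums, means)]

print(sums)         # [12, 30, 6]
print(square_sums)  # [50, 308, 20]
print(variances)    # [Fraction(2, 3), Fraction(8, 3), Fraction(8, 3)]
```

The gap between consecutive sequence elements is what keeps the per-dimension sums from overflowing into each other, so the sequence for squared values needs the larger bound. The variance here is the population variance, i.e. the mean of the squares minus the square of the mean.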

Homomorphic encryption is a common technique used by existing data aggregation schemes [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36]. By exploiting the homomorphic properties of the encryption algorithm, data aggregation enables the control center of a smart grid to compute statistical results directly over the aggregated ciphertext without decrypting the individual ciphertexts from different smart meters, thereby protecting users’ privacy. In 2012, Lu et al. [26] used the Paillier homomorphic cryptosystem [17] to design a privacy-preserving aggregation scheme for secure smart grids. Their scheme supports sum computation by the control center thanks to the additive homomorphic property of the Paillier cryptosystem, but it does not support variance or other more complex statistical computations. Li et al. [27] also employed the Paillier homomorphic cryptosystem and combined it with superincreasing sequences to propose a privacy-preserving multi-subset aggregation scheme for smart grids. Their scheme can compute the total power consumption in different ranges as well as the number of users in each range. However, it does not consider data integrity, which is also a critical security requirement. Wang et al. [28] extended the methods in [27] and added a procedure that updates the blinding factors used by smart meters to achieve stronger security. Their scheme also achieves fault tolerance to handle device malfunction or battery exhaustion.

Wang et al. [37] proposed an identity-based data aggregation scheme for smart grids. They presented a new definition of an identity-based encryption and signature scheme, in which an identity-based encryption scheme and an identity-based signature scheme are combined into a single scheme that shares the same private and public parameters. The identity-based additive homomorphic encryption scheme in [29] serves as the basic component for encrypting and aggregating users’ data in their scheme. Merad-Boudia et al. [30] proposed an efficient and secure multi-dimensional data aggregation scheme for fog computing-based smart grids. They also used the Paillier cryptosystem to encrypt users’ power consumption data, and integrated a binary encoding algorithm to aggregate multi-dimensional data. Liu et al. [31] proposed a privacy-preserving aggregation scheme based on the double trapdoor decryption cryptosystem proposed in [38] and an arithmetic circuit technique. In their scheme, the encrypted power consumption data can be outsourced to clouds while being efficiently aggregated by fog nodes. With the encrypted data, their scheme allows a requester (the control center or a service provider) to launch function queries for its particular applications. Alsharif et al. [39] used a masking technique to propose a privacy-preserving multi-dimensional and multi-subset data aggregation scheme, which achieves multi-dimensional data aggregation with multiple subsets for each dimension.

Chen et al. [32] proposed a data aggregation scheme for electric vehicle networks by designing an identity-based sequential aggregate signed data based on factoring and a threshold Paillier homomorphic encryption scheme. Lyu et al. [33] combined differential privacy and homomorphic encryption to present a data aggregation scheme for smart grids with the aid of a fog computing architecture. They used the Gaussian mechanism to distribute noise generation among parties and offer provable differential privacy guarantees for the aggregate statistic at both the fog level and the cloud level. Ding et al. [40] constructed an identity-based metering data aggregation scheme to achieve more efficient batch verification of data integrity by a collector and the power service provider. Zhao et al. [34] introduced a somewhat homomorphic encryption scheme into fog computing-based smart grids to present a data aggregation scheme with smart pricing and a packing method. Recently, Zhang et al. [35] proposed an encrypted data aggregation scheme with lightweight verification in fog-assisted smart grids, which addresses the key-leakage resilience issue.

The remainder of the paper is organized as follows. Section 2 gives the necessary preliminaries. Section 3 introduces the system framework and design goals. Section 4 presents the modified four-prime BGN encryption algorithm and the detailed construction of SA-MAD. Section 5 provides a detailed security analysis, covering the privacy-preserving property and the data integrity guarantee. Section 6 describes an extension of SA-MAD. Section 7 gives a functionality comparison and a comprehensive performance evaluation against existing schemes. Finally, Section 8 concludes our work.

Bilinear pairing

Bilinear pairing is an important cryptographic technique, and has been widely adopted in many cryptographic schemes. Let $G_1$ and $G_2$ be two cyclic multiplicative groups with the same prime order $q_0$. A bilinear pairing is a mapping $\tilde{e}: G_1 \times G_1 \to G_2$, which satisfies the following properties:

  • 1.

    Bilinearity: For any $R_1, R_2 \in G_1$ and $a, b \in \mathbb{Z}_{q_0}$, we have $\tilde{e}(R_1^{a}, R_2^{b}) = \tilde{e}(R_1, R_2)^{ab}$.

  • 2.

    Non-degeneracy: There exist $R_1, R_2 \in G_1$ such that $\tilde{e}(R_1, R_2) \neq 1$, where $1$ is the identity element of the multiplicative cyclic group $G_2$.

  • 3.

System framework and design goals

Statistical analysis over multi-dimensional encrypted aggregated data

In this section, we first extend the original BGN cryptosystem to a dual-message encryption mode in which the composite order n of a cyclic group G is the product of four distinct primes, and two different decryption private keys are introduced to recover two types of plaintexts on demand. Then, we combine the modified four-prime BGN encryption algorithm with two superincreasing sequences to ensure multi-dimensional encrypted data aggregation, such that the control center could

Security analysis

In this section, we provide security analysis of SA-MAD, in terms of data confidentiality and integrity preservation. We first prove that the modified BGN cryptosystem achieves ciphertext indistinguishability under the composite order subgroup decision assumption, as analyzed by Boneh et al. [18].

Theorem 1

The modified four-prime BGN encryption algorithm achieves ciphertext indistinguishability under the composite order subgroup decision assumption.

Proof

The subgroup decision problem states that a uniform

Further extension of SA-MAD

As the proposed modified BGN cryptosystem could encrypt dual messages during a single round data transmission, SA-MAD could be extended to enable CC to conduct more complex statistical analysis, e.g., covariance computation, linear regression computation.

  • 1.

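    Covariance Computation Enabled Aggregation: The covariance $\mathrm{cov}(X,Y)$ is a measure of the strength of the correlation between two variables $X$ and $Y$. It is defined as $$\mathrm{cov}(X,Y)=\frac{1}{\hbar}\sum_{i=1}^{\hbar}x_i y_i-\frac{1}{\hbar}\sum_{i=1}^{\hbar}x_i\cdot\frac{1}{\hbar}\sum_{i=1}^{\hbar}y_i=\frac{1}{\hbar}\sum_{i=1}^{\hbar}x_i y_i-\frac{1}{\hbar^{2}}\sum_{i=1}^{\hbar}x_i\sum_{i=1}^{\hbar}y_i.$$ Here, $\hbar$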
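As a rough illustration of why aggregated sums are enough for these statistics, the sketch below computes the covariance from three totals (the sum of products and the two plain sums) and derives a least-squares regression line from the same kind of totals. The function names and sample data are hypothetical, the totals are computed in the clear here rather than recovered from an aggregated ciphertext, and the regression formula is the standard textbook one, not necessarily the exact procedure used in SA-MAD.

```python
from fractions import Fraction

def covariance_from_sums(sum_xy, sum_x, sum_y, count):
    """cov(X, Y) = (1/count) * sum(x*y) - (1/count^2) * sum(x) * sum(y)."""
    return Fraction(sum_xy, count) - Fraction(sum_x * sum_y, count * count)

def regression_from_sums(sum_xy, sum_x, sum_y, sum_x2, count):
    """Least-squares line y = slope * x + intercept from aggregated totals."""
    cov_xy = covariance_from_sums(sum_xy, sum_x, sum_y, count)
    var_x = covariance_from_sums(sum_x2, sum_x, sum_x, count)  # var(X) = cov(X, X)
    slope = cov_xy / var_x
    intercept = Fraction(sum_y, count) - slope * Fraction(sum_x, count)
    return slope, intercept

# Hypothetical data for two measured quantities across four users.
xs = [2, 4, 6, 8]
ys = [1, 3, 2, 6]
count = len(xs)

# In an aggregation setting these totals would come from the decrypted aggregate.
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

cov = covariance_from_sums(sum_xy, sum_x, sum_y, count)
slope, intercept = regression_from_sums(sum_xy, sum_x, sum_y, sum_x2, count)
print(cov, slope, intercept)   # 7/2 7/10 -1/2
```

Exact fractions are used so that the results are not affected by floating-point rounding; only four totals and the user count are needed, none of which reveals an individual user's values.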

Functionality comparison and performance evaluation

In this section, we first compare the functionalities of SA-MAD with those of existing schemes [28], [30], [31], [40]. Then, we compare its computational costs and communication overhead with those of the existing schemes [30], [31], [40], which also achieve integrity of the encrypted power consumption data across the whole smart grid. We implement these data aggregation schemes in experiments run on a Windows 10 system with an Intel(R) Core(TM) i5-2320 CPU @ 3.00 GHz and

Conclusions

In this paper, we have proposed a privacy-preserving statistical analysis scheme over multi-dimensional aggregated data (SA-MAD) in edge computing-based smart grid systems. SA-MAD combines the modified four-prime BGN encryption algorithm with two superincreasing sequences to ensure multi-dimensional encrypted data aggregation, such that the control center could further conduct privacy-preserving multi-functional statistical analysis over those aggregated data, such as sum, average, and

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 61902327, U1636217, and 62002050; in part by the China Postdoctoral Science Foundation under Grant 2020M681316; in part by the Youth Scientific and Technology Innovation Team Project of SWPU, China under Grant 2019CXTD05; and in part by the Chengdu Key R & D project, China under Grant 2021-YF05-00965-SN.

References (42)

  • Xiao J. et al. Observation of security region boundary for smart distribution grid. IEEE Trans. Smart Grid (2017)
  • He J. et al. An efficient and accurate nonintrusive load monitoring scheme for power consumption. IEEE Internet Things J. (2019)
  • Akaber P. et al. Cases: Concurrent contingency analysis-based security metric deployment for the smart grid. IEEE Trans. Smart Grid (2020)
  • Deng R. et al. CCPA: Coordinated cyber–physical attacks and countermeasures in smart grid. IEEE Trans. Smart Grid (2017)
  • Shen H. et al. Efficient privacy-preserving cube-data aggregation scheme for smart grids. IEEE Trans. Inf. Forensics Secur. (2017)
  • Rahman M.A. et al. Secure and private data aggregation for energy consumption scheduling in smart grids. IEEE Trans. Dependable Secure Comput. (2017)
  • Bao H. et al. A new differentially private data aggregation with fault tolerance for smart grid communications. IEEE Internet Things J. (2015)
  • Paillier P. Public-key cryptosystems based on composite degree residuosity classes
  • Boneh D. et al. Evaluating 2-DNF formulas on ciphertexts
  • Liu Y. et al. Toward edge intelligence: multiaccess edge computing for 5G and internet of things. IEEE Internet Things J. (2020)
  • Li Y. et al. Learning-aided computation offloading for trusted collaborative mobile edge computing. IEEE Trans. Mob. Comput. (2020)

    Xiaojun Zhang received the Ph.D. degree in information security from the University of Electronic Science and Technology of China (UESTC) in 2015. He was a research scholar from 2018 to 2019 in the School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore. He worked as a Postdoctoral Fellow from 2016 to 2019 at the University of Electronic Science and Technology of China. He is currently an associate professor in the School of Computer Science, Southwest Petroleum University, China. He has also worked as a Postdoctoral Fellow since 2020 in the School of Cyber Science and Engineering, Shanghai Jiao Tong University, China. His research interests include applied cryptography, network security, cloud computing security, and the security and privacy of smart grids.

    Chao Huang received the M.S. degree in computer science and technology from Southwest Petroleum University in 2021. He is currently pursuing the Ph.D. degree in the School of Cyber Science and Technology, Beihang University, China. His research interests include cryptography, network security, cloud computing security, and data security of smart grid systems.

    Dawu Gu is a full professor in the School of Cyber Science and Engineering at Shanghai Jiao Tong University (SJTU). He received his B.S. degree in applied mathematics in 1992, and his M.S. degree in 1995 and Ph.D. degree in 1998, both in cryptography, from Xidian University of China. His current research interests include cryptography, side channel attacks, and software security. He leads the Laboratory of Cryptology and Computer Security (LoCCS) at SJTU.

    Jingwei Zhang received the B.E. degree in computer network engineering from Southwest Petroleum University, Chengdu, China, in 2019, where he is currently pursuing the M.S. degree in computer science and technology with the School of Computer Science. His current research interests include cryptography, cloud computing security, and big data security.

    Jingting Xue received the B.Sc. degree and the Ph.D. degree from the University of Electronic Science and Technology of China, China, in 2014 and 2020, respectively. As a visiting Ph.D. student, she spent one year studying and conducting research at Nanyang Technological University, Singapore, in 2019. She is currently a lecturer in the School of Computer Science, Southwest Petroleum University. Her research interests are applied cryptography, cloud storage, and blockchain technology.

    Huaxiong Wang received the Ph.D. degree in mathematics from the University of Haifa, Israel, in 1996 and the Ph.D. degree in computer science from the University of Wollongong, Australia, in 2001. He joined Nanyang Technological University in 2006 and is currently an associate professor in the Division of Mathematical Sciences. He is also an honorary fellow at Macquarie University, Australia. His research interests include cryptography, information security, coding theory, combinatorics, and theoretical computer science. He has been on the editorial board of three international journals: Designs, Codes and Cryptography (2006–2011), the Journal of Communications (JCM), and the Journal of Communications and Networks. He was the program co-chair of the Ninth Australasian Conference on Information Security and Privacy (ACISP 04) in 2004 and the Fourth International Conference on Cryptology and Network Security (CANS 05) in 2005, and has served on the program committees of more than 70 international conferences. He received the inaugural Award of Best Research Contribution from the Computer Science Association of Australasia in 2004.
