Carp: A cost-aware relaxed protocol for encrypted data stores

https://doi.org/10.1016/j.jisa.2020.102501

Abstract

Distributed data stores are critical to the success of applications in the cloud. Massive volumes of user data are stored and processed with the support of underlying distributed data stores. With large amounts of data stored remotely in the cloud, security becomes a major concern. Cloud storage providers offer authentication and access control. But even with proper authentication and access control policies, storage systems remain vulnerable to attackers who have direct access to storage devices such as disks. Encryption makes it computationally difficult to retrieve the original data even when attackers have access to the disks. However, there are many challenges in designing an encrypted distributed data store that is both highly secure and cost-aware.

In this paper, we show that security flexibility and cost efficiency can be achieved at the same time. We present Carp, a cost-aware relaxed protocol for encrypted data stores. Carp is a heuristic solution rather than an optimal one. The key idea is to reduce additional encryption operations for frequently accessed data. This is achieved by allowing data objects to stay unencrypted for a short period after they are accessed. Reducing encryption operations in turn reduces the computational cost and power consumption of the data store. Unlike conventional encrypted file systems, which store data encryption keys on disk, we present a hybrid design of key generation and caching. Data encryption keys are generated for individual objects or groups of objects using cryptographic hashing. We develop a prototype data store and conduct experiments. The experimental results show that Carp can reduce encryption operations by up to 20% while maintaining a high level of security.

Introduction

Large-scale web applications rely on back-end distributed storage [1], [2]. Distributed storage systems provide high data reliability and availability for web applications [3,4]. The types of data stored in these systems vary and include web pages (Google), photos (Instagram), and videos (YouTube). Massive volumes of data are uploaded to cloud storage systems. For instance, users upload over 300 million photos to Facebook per day [5]. Most of these data are stored on the disks of thousands or more servers in back-end data centers [6].

Besides availability and reliability, users also expect such systems to protect their data after the data are uploaded [7,8]. To protect data against attacks, techniques such as authentication and access control are used to resist malicious activities over the network [9,10]. However, authentication and access control do not protect against insiders who have direct access to the disks. Thus, risks remain as long as data are unencrypted on disk.

Encryption is a common technique to provide strong security for data on disk. In encrypted data stores, data are always encrypted when they are stored on the server. When client programs need to access the data, they have to decrypt them first. Cloud storage providers, including IBM, Amazon, and Microsoft, are aware of the importance of this and offer encryption services for their customers [11], [12], [13].

Much research has been done on designing encrypted storage systems [14], [15], [16]. The major concern is that encryption has overhead and adds cost to the system. For large-scale data stores secured by encryption, frequent accesses to data objects can cause large numbers of encryption/decryption operations in a short time. This in turn means high computational cost and power consumption in the data stores. The overhead of key management for large amounts of encrypted data is also non-trivial. Hierarchical key organization with multiple levels of key protection is a popular alternative. Horus [17] used a keyed hash tree for key management, where keys are computed in real time. Only a root key is stored on disk, which saves disk space. Plutus [18] used client-centric key management, where keys are maintained by clients rather than servers.
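The idea of computing keys on demand from a single stored root key can be illustrated with a minimal sketch. It is not the construction used by Horus or by Carp: the use of HMAC-SHA256, the function names, the all-zero root key, and the path labels are all assumptions chosen for illustration.

```python
import hashlib
import hmac
from typing import Sequence


def derive_key(parent_key: bytes, label: str) -> bytes:
    """Derive a child key from a parent key and a label with HMAC-SHA256."""
    return hmac.new(parent_key, label.encode("utf-8"), hashlib.sha256).digest()


def derive_leaf_key(root_key: bytes, path: Sequence[str]) -> bytes:
    """Walk down a path of labels, deriving one key per level of the hierarchy."""
    key = root_key
    for label in path:
        key = derive_key(key, label)
    return key


# Hypothetical root key and object paths, for illustration only.
# A real deployment would use a randomly generated, protected root key.
root_key = bytes(32)
key_a = derive_leaf_key(root_key, ["bucket-1", "object-a"])
key_a_again = derive_leaf_key(root_key, ["bucket-1", "object-a"])
key_b = derive_leaf_key(root_key, ["bucket-1", "object-b"])
```

Because derivation is deterministic, the same path always yields the same key, so per-object keys never need to be written to disk; only the root key has to be kept.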

We observe the following challenges when applying encryption-based security to cloud storage: 1) The overhead (computation and communication cost) of frequent encryption/decryption on large volumes of data. 2) The coarse-grained design for data encryption and cost efficiency.

In this paper, we present Carp, a cost-aware relaxed protocol for encrypted data stores. The main goals are to minimize the overhead of encryption for frequent I/Os and to provide a good balance between data security and cost efficiency. As its core design, Carp allows plain data objects to exist in the system for various intervals after they are accessed. It also provides the flexibility to adjust the intervals based on different levels of security demands. In a secure distributed data store, every data object is encrypted with a data key when the object is stored on disk or in memory. The key can be unique or shared by multiple objects. Keys are not stored on disk. They are either kept in memory or generated at runtime. The encryption status of each object is tracked and analyzed by the system.
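To make the effect of a relaxed interval concrete, the following sketch counts re-encryption operations for a single object. It is an assumed simplification, not Carp's algorithm: the function name, the access trace, and the policy of restarting the interval on every access are all hypothetical.

```python
def count_encryptions(access_times, relaxed_interval):
    """Count re-encryptions of one object under a relaxed interval.

    Assumed policy: after each access the object stays in plaintext for
    `relaxed_interval` time units; a later access inside that window reuses
    the plaintext copy and restarts the window. The object is re-encrypted
    once the window expires without a new access.
    """
    encryptions = 0
    plaintext_until = None  # time at which the plaintext copy is re-encrypted
    for t in sorted(access_times):
        if plaintext_until is not None and t > plaintext_until:
            encryptions += 1  # window expired before this access
        plaintext_until = t + relaxed_interval
    if plaintext_until is not None:
        encryptions += 1  # final re-encryption after the last access
    return encryptions


# Hypothetical access trace (arbitrary time units).
trace = [0, 1, 2, 10, 11, 30]
strict = count_encryptions(trace, relaxed_interval=0)   # re-encrypt after every access
relaxed = count_encryptions(trace, relaxed_interval=5)  # bursts share one re-encryption
```

With this trace, the strict setting re-encrypts after each of the six accesses, while the relaxed setting re-encrypts three times, once per burst of accesses. The numbers describe only this toy trace and say nothing about the paper's measured results.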

The contributions of this paper include:

  • A cost-aware relaxed protocol is introduced to reduce the overhead of encryption on frequently accessed data objects. Popular data objects can remain unencrypted for a relaxed interval based on a pre-defined security level.

  • We define encryption rate to quantify security levels of data objects, and allow flexible grouping of data objects based on their security levels.

  • We use encryption bits to track the security status of data objects. The overhead of the monitoring is minimized.

  • A hybrid key management with caching is presented to ensure the safety of data encryption keys while reducing overheads.

  • We implement a prototype data store, conduct evaluations, and run simulated workloads on our data store. The results show Carp can achieve both data security and cost-efficiency in encrypted data stores.
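Tracking encryption status with one bit per object, as mentioned in the contributions above, can be sketched as a simple bitmap. The class and method names are hypothetical, and the layout (one bit per object, 1 meaning encrypted) is an assumption rather than the paper's data structure.

```python
class EncryptionBitmap:
    """One bit per object: 1 = encrypted at rest, 0 = currently plaintext."""

    def __init__(self, num_objects: int) -> None:
        self.num_objects = num_objects
        # All objects start encrypted, so every bit starts at 1.
        self.bits = bytearray(b"\xff" * ((num_objects + 7) // 8))

    def set_encrypted(self, index: int, encrypted: bool) -> None:
        """Record whether the object at `index` is encrypted."""
        byte, offset = divmod(index, 8)
        if encrypted:
            self.bits[byte] |= 1 << offset
        else:
            self.bits[byte] &= ~(1 << offset) & 0xFF

    def is_encrypted(self, index: int) -> bool:
        byte, offset = divmod(index, 8)
        return bool(self.bits[byte] & (1 << offset))

    def plaintext_count(self) -> int:
        """Number of tracked objects that are currently in plaintext."""
        return sum(not self.is_encrypted(i) for i in range(self.num_objects))


# Hypothetical usage: 10 objects, two are accessed, one is re-encrypted.
bitmap = EncryptionBitmap(10)
bitmap.set_encrypted(3, False)
bitmap.set_encrypted(7, False)
bitmap.set_encrypted(3, True)
```

A bitmap like this costs one bit per object, which is one way the monitoring overhead could be kept small.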

The rest of the paper is organized as follows: Section 2 motivates the proposed work. Section 3 gives an overview of the storage system and its data-sharing model. Sections 4 and 5 explain the core algorithms of Carp and the hybrid key management design. Section 6 presents our implementation and evaluates Carp with different workloads. Section 8 provides a literature review of secure data stores. Section 9 concludes the paper.

Section snippets

Parallel access to encrypted data

In encrypted cloud storage, data files must be decrypted before a client can access them. After each access, the data are encrypted again. The encryption process can be done on both the client side and the server side. For data that are used by individuals or rarely shared, the traditional encryption process has limited overhead. However, for shared data that are concurrently accessed by different clients, the shortcomings of traditional approaches become obvious. One common way is to handle requests to

Encrypted storage system

The architecture of our encrypted storage system follows a common master-worker design [3]–[19]. It consists of three major components: MetaServer (master), DataServer (worker), and KeyServer (Fig. 1). Fig. 2 illustrates how the three servers collaborate with each other to achieve data encryption. Some key notations are listed in Table 1. The MetaServer (M) is responsible for managing metadata, such as locations

Carp overview

We present Carp, a novel cost-aware relaxed protocol for encrypted data stores. We first define the key concepts, including reduced encryption operations, secure group, and encryption rate, and then illustrate the details of the Carp algorithms.

Key management

We present a novel hybrid key management design in this section. Most distributed data stores provide server-side encryption without storing per-data keys. This usually involves a master key or a root key. They either use the root key to encrypt/decrypt data keys from clients [12]–[20], or use the root key to generate data keys [17]. Our approach generates data encryption keys at runtime with multiple hash functions. Instead of letting clients

Evaluation

As explained previously, reducing encryption operations helps achieve cost efficiency in encrypted data stores. In our evaluation, we use the number of reduced encryption operations to show the cost awareness of Carp. We run Carp on a customized encrypted data store with simulated workloads. Through the evaluation, we observe that Carp can reduce encryption operations by 10% to 20%, depending on the encryption rate. The evaluations also show the effectiveness of Carp when the system is handling

Security analysis of Carp

Carp is a heuristic solution rather than an optimal one. We try to achieve better read/write performance in encrypted data stores with best-effort security. As mentioned in Section 4.1, we use encryption rate to quantify the security level of the system. Our assumption is: as the system runs for time T, the average unencrypted time for an object is t, and the improved operation latency is P. In scenarios where data are not required to be strictly encrypted, to achieve a better improvement of

Related work

The proposed work is motivated by much previous research in secure distributed systems [27], [28], [29]. Horus [17] aimed to provide fine-grained encryption-based security for large-scale storage systems. It has several contributions: 1) it uses a keyed hash tree design to provide efficient key management for large-scale systems; 2) it uses a root key and the access location of the file to calculate the leaf key for encryption and decryption of file blocks, which saves disk space for

Conclusions

Encryption techniques are critical to the safety of distributed data stores and their applications. Server-side encryption and key management bring overhead to data stores, especially for frequently accessed data. Existing approaches lack flexibility in data encryption and cost reduction. Carp allows the adjustment of encryption intervals to reduce the cost of data encryption. It also uses secure groups and hybrid key management for better performance. Based on the evaluated results,

Declaration of Competing Interest

None.

CRediT authorship contribution statement

Longbin Chen: Conceptualization, Methodology, Software, Writing - original draft. Li-Chiou Chen: Supervision, Formal analysis, Writing - review & editing. Nader Nassar: Writing - review & editing.

References (36)

  • W. Dai et al., “Dass: A web-based fine-grained data access system for smartphones”
  • L. Chen et al., “An analysis of server-side design for seed-based mobile authentication”
  • I. Cloud, “IBM multi-cloud data encryption,”...
  • A. web service, “protecting data using encryption,”...
  • M. Azure, “Azure storage service encryption for data at rest,”...
  • W. Dai et al., “A privacy-protection data separation approach for fine-grained data access management”
  • L. Chen et al., “A design for scalable and secure key-value stores”
  • L. Chen et al., “Seed-based authentication for mobile clients across multiple devices,” Inf Security J (2018)