DOI: 10.1145/3135974.3135985

Ginja: one-dollar cloud-based disaster recovery for databases

Published: 11 December 2017

Abstract

Disaster Recovery (DR) is a crucial feature for ensuring availability and data protection in modern information systems. A common DR approach replicates services in a set of virtual machines (VMs) running in the cloud as backups. This incurs considerable monetary cost and management effort to keep those cloud VMs running. We present Ginja, a DR solution for transactional database management systems (DBMS) that uses only cloud storage services such as Amazon S3. Ginja works at the file-system level to efficiently capture and replicate data updates to a remote cloud storage service, achieving three important goals: (1) it reduces the cost of maintaining a cloud-based DR solution to less than one dollar per month for relevant database sizes and workloads (up to 222× less than the traditional approach of keeping a DBMS replica in a cloud VM); (2) it allows precise control of the trade-offs among operational cost, durability, and performance; and (3) it introduces a small performance overhead to the DBMS (e.g., less than 5% overhead for the TPC-C workload with ≈ 10 seconds of data loss in case of disasters).
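
The cost, durability, and performance trade-off mentioned in the abstract can be illustrated with a minimal sketch. This is not Ginja's actual algorithm: the class name `BatchingReplicator`, its two thresholds, and the in-memory dictionary standing in for a cloud object store are all assumptions made for illustration. The idea shown is only that buffering intercepted file writes and uploading them in batches lowers the number of storage requests (and hence cost), while anything still buffered when a disaster strikes is lost.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class BatchingReplicator:
    """Hypothetical illustration: buffer file writes, upload them in batches."""
    max_batch_writes: int        # flush once this many writes are buffered
    max_delay_seconds: float     # flush once the oldest buffered write is this old
    # Stand-in for a cloud object store (assumption: a plain dict, no real cloud API).
    store: Dict[str, bytes] = field(default_factory=dict)
    pending: List[Tuple[float, str, int, bytes]] = field(default_factory=list)
    uploads: int = 0             # number of PUT-like requests issued so far

    def on_write(self, now: float, path: str, offset: int, data: bytes) -> None:
        """Record one intercepted write; `now` is a timestamp in seconds."""
        self.pending.append((now, path, offset, data))
        oldest = self.pending[0][0]
        too_many = len(self.pending) >= self.max_batch_writes
        too_old = now - oldest >= self.max_delay_seconds
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        """Upload all buffered writes as a single object and clear the buffer."""
        if not self.pending:
            return
        key = f"batch-{self.uploads:08d}"
        # Each record is a small text header followed by the raw bytes written.
        payload = b"".join(
            f"{path}:{offset}:{len(data)}\n".encode() + data
            for _, path, offset, data in self.pending
        )
        self.store[key] = payload
        self.uploads += 1
        self.pending.clear()
```

Larger values of `max_batch_writes` and `max_delay_seconds` mean fewer uploads but a wider window of writes that exist only locally; smaller values narrow that window at the price of more requests.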

Published In

Middleware '17: Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference
December 2017
268 pages
ISBN:9781450347204
DOI:10.1145/3135974
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • USENIX Assoc
  • IFIP

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cloud
  2. databases
  3. disaster recovery

Qualifiers

  • Research-article

Conference

Middleware '17: 18th International Middleware Conference
December 11 - 15, 2017
Las Vegas, Nevada

Acceptance Rates

Middleware '17 Paper Acceptance Rate 20 of 85 submissions, 24%;
Overall Acceptance Rate 203 of 948 submissions, 21%

Cited By

  • (2024) Decentralized FaaS over Multi-Clouds with Blockchain based Management for Supporting Emerging Applications. Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, 122-130. DOI: 10.1145/3605098.3636029. Online publication date: 8-Apr-2024
  • (2020) A Scalable Multicloud Storage Architecture for Cloud-Supported Medical Internet of Things. IEEE Internet of Things Journal 7:3, 1641-1654. DOI: 10.1109/JIOT.2019.2946296. Online publication date: Mar-2020
  • (2020) An Effective Remote Data Disaster Recovery Plan for the Space TT&C System. Machine Learning for Cyber Security, 31-41. DOI: 10.1007/978-3-030-62460-6_4. Online publication date: 8-Oct-2020
  • (2019) Understanding I/O performance of IPFS storage. Proceedings of the International Symposium on Quality of Service, 1-10. DOI: 10.1145/3326285.3329052. Online publication date: 24-Jun-2019
  • (2019) OLAP parallel query processing in clouds with C‐ParGRES. Concurrency and Computation: Practice and Experience 32:7. DOI: 10.1002/cpe.5590. Online publication date: 19-Dec-2019
