ABSTRACT
Many systems use snapshot isolation, or something similar, as their default isolation level, and multi-version concurrency control (MVCC) remains essential to offering such point-in-time consistency. One major issue in MVCC is the timely removal of unnecessary versions of data items, especially in the presence of long-lived transactions (LLTs). We have observed that the latest versions of MySQL and PostgreSQL are still vulnerable to LLTs. Our analysis of existing proposals suggests that new solutions must provide rigorous rules for completely identifying unnecessary versions, as well as elaborate designs for version cleaning, so that old versions required by LLTs do not suspend garbage collection. In this paper, we formalize such rules into our version pruning theorem and version classification, which together form the theoretical foundations for our new version management system, vDriver. vDriver bases its record versioning on a new principle: Single In-row Remaining Off-row (SIRO) versioning. We implemented a prototype of vDriver and integrated it with MySQL-8.0 and PostgreSQL-12.0. The experimental evaluation demonstrated that the engines with vDriver continue to reclaim dead versions in the face of LLTs while retaining transaction throughput and reducing space consumption.
Index Terms
- Long-lived Transactions Made Less Harmful