Logless one-phase commit made possible for highly-available datastores

Distributed and Parallel Databases

Abstract

Highly-available datastores are widely deployed for Internet-based applications. However, many such applications are not content with the simple data access interface these datastores provide. Applications such as the massive online payment services of Alipay, PayPal, and Baidu Wallet demand distributed transaction support. Current solutions for distributed transactions can spend more than half of the total transaction processing time in distributed commit. The culprits are the multiple write-ahead logging steps and communication roundtrips in the commit process. This paper presents HACommit, a logless one-phase commit protocol for highly-available datastores. In HACommit, transaction participants vote on commit before the client decides to commit or abort the transaction; in comparison, the state-of-the-art practice for distributed commit is to have the client decide before participants vote. This change enables the removal of both the participant’s and the coordinator’s write-ahead logging steps from the distributed commit process; it also makes it possible for the transaction data to become visible to other transactions within one communication roundtrip (i.e., one phase) after the client initiates the commit. In extensive experiments, HACommit outperforms recent atomic commit solutions for highly-available datastores under different workloads. In the best case, HACommit can commit in one fifth of the time taken by the widely used two-phase commit (2PC).
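
The ordering difference described in the abstract can be illustrated with a toy, in-memory model. The sketch below is not the HACommit protocol itself: the class `Participant` and the functions `two_phase_commit` and `vote_before_decide` are hypothetical names introduced here. The 2PC side is simplified to one forced log record per participant, and the sketch does not model failures, replication, or how durability is preserved without logging. It only counts the roundtrips and forced log writes that occur after the client asks to commit under each ordering.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Toy participant (hypothetical, for illustration only)."""
    name: str
    can_commit: bool = True
    log_writes: int = 0  # forced log records written during commit
    pending: dict = field(default_factory=dict)
    committed: dict = field(default_factory=dict)

    def execute(self, key, value):
        """Buffer a write and return a vote together with the result."""
        self.pending[key] = value
        return self.can_commit

    def prepare(self):
        """2PC phase 1 (simplified): force a prepare record, then vote."""
        self.log_writes += 1
        return self.can_commit

    def finish(self, commit):
        """Apply or discard the buffered writes once the decision arrives."""
        if commit:
            self.committed.update(self.pending)
        self.pending.clear()


def two_phase_commit(participants, writes):
    """Client decides first; participants vote afterwards."""
    for p, (key, value) in zip(participants, writes):
        p.pending[key] = value  # execution carries no vote
    votes = [p.prepare() for p in participants]  # roundtrip 1: prepare and vote
    decision = all(votes)
    for p in participants:  # roundtrip 2: deliver the decision
        p.finish(decision)
    return decision, 2


def vote_before_decide(participants, writes):
    """Participants vote during execution; the client decides afterwards."""
    votes = [p.execute(k, v) for p, (k, v) in zip(participants, writes)]
    decision = all(votes)  # all votes are already known to the client
    for p in participants:  # single roundtrip: deliver the decision
        p.finish(decision)
    return decision, 1


if __name__ == "__main__":
    writes = [("x", 1), ("y", 2)]
    for protocol in (two_phase_commit, vote_before_decide):
        nodes = [Participant("a"), Participant("b")]
        decision, roundtrips = protocol(nodes, writes)
        logs = sum(p.log_writes for p in nodes)
        print(protocol.__name__, decision, roundtrips, logs)
```

In this toy model, the second ordering needs no voting round after the commit request because every vote arrived with the execution results, which is the structural change the abstract credits for the one-phase commit.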

Acknowledgements

This work is supported in part by the State Key Development Program for Basic Research of China (Grant No. 2014CB340402), the National Key R&D Program of China (No. 2016YFB1000201), the National Natural Science Foundation of China (Grant No. 61303054 and 61420106013), and Youth Innovation Promotion Association of Chinese Academy of Sciences.

Author information

Correspondence to Yuqing Zhu.

About this article

Cite this article

Zhu, Y., Yu, P.S., Yi, G. et al. Logless one-phase commit made possible for highly-available datastores. Distrib Parallel Databases 38, 101–126 (2020). https://doi.org/10.1007/s10619-019-07261-2

  • DOI: https://doi.org/10.1007/s10619-019-07261-2
