Fast Follower Recovery for State Machine Replication

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10366)

Abstract

State machine replication with a single strong Leader is widely used in modern cluster-based database systems. In practice, recovery speed has a significant impact on system availability. However, to guarantee data consistency, existing Follower recovery protocols in Paxos replication (e.g., Raft) need multiple network round trips or extra data transmission, which may increase recovery time. In this paper, we propose the Follower Recovery using Special mark log entry (FRS) algorithm. FRS is more robust and resilient to Follower failure, and it needs only one network round trip to fetch the minimum number of log entries. This approach is implemented in the open source database system OceanBase. We show experimentally that a system adopting FRS performs well in terms of recovery time.
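To make the "multiple network trips" cost concrete, the following is a minimal sketch of the baseline the abstract contrasts against: Raft-style log backtracking, in which the Leader probes one log index per round trip until it finds an entry the recovering Follower agrees on. It is not the FRS algorithm, whose details are not given in this abstract. The names `Entry` and `find_divergence_by_backtracking` are hypothetical, and network calls are simulated by direct list access.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entry:
    term: int      # Leader term in which the entry was created
    command: str   # opaque state machine command


def find_divergence_by_backtracking(
    leader: List[Entry], follower: List[Entry]
) -> Tuple[int, int]:
    """Return (match_len, round_trips) for unoptimized Raft-style probing.

    match_len is the number of leading entries the two logs share; the
    Follower would truncate its log to that length and append
    leader[match_len:]. Each loop iteration models one request/response.
    """
    next_index = len(leader)  # optimistic first guess: Follower is up to date
    round_trips = 0
    while True:
        round_trips += 1
        prev = next_index - 1  # entry that must match before appending
        if prev < 0:
            # Empty prefix always matches: the whole log must be resent.
            return 0, round_trips
        if prev < len(follower) and follower[prev].term == leader[prev].term:
            return next_index, round_trips
        next_index -= 1  # mismatch: back up one entry and probe again


# Hypothetical logs: the Follower holds two stale entries from term 2.
leader_log = [Entry(1, "a"), Entry(1, "b"), Entry(2, "c"), Entry(3, "d"), Entry(3, "e")]
follower_log = [Entry(1, "a"), Entry(1, "b"), Entry(2, "c"), Entry(2, "x"), Entry(2, "y")]

match_len, trips = find_divergence_by_backtracking(leader_log, follower_log)
recovered = follower_log[:match_len] + leader_log[match_len:]
print(match_len, trips)  # 3 3
```

In this sketch the number of round trips grows with the number of stale entries on the Follower, which is the recovery cost the abstract says FRS reduces to a single round trip.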

Notes

  1. There is an optimization in Raft that reduces the number of network interactions, but the optimized approach still does not locate the divergent point directly.

  2. https://github.com/alibaba/oceanbase/

Acknowledgments

This work is partially supported by National High-tech R&D Program (863 Program) under grant number 2015AA015307, National Science Foundation of China under grant numbers 61432006 and 61672232, and Guangxi Key Laboratory of Trusted Software (kx201602).

Author information

Corresponding author

Correspondence to Peng Cai.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Guo, J., Wang, J., Cai, P., Qian, W., Zhou, A., Zhu, X. (2017). Fast Follower Recovery for State Machine Replication. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10366. Springer, Cham. https://doi.org/10.1007/978-3-319-63579-8_24

  • DOI: https://doi.org/10.1007/978-3-319-63579-8_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63578-1

  • Online ISBN: 978-3-319-63579-8

  • eBook Packages: Computer Science, Computer Science (R0)
