Fault-tolerant precise data access on distributed log-structured merge-tree

  • Research Article
  • Frontiers of Computer Science

Abstract

The log-structured merge-tree (LSM-tree) has been adopted by many distributed storage systems. It decomposes a large database into multiple parts: an in-writing part and several read-only ones. Records are first written into a memory-optimized structure and then periodically compacted into on-disk structures. This design achieves high write throughput. However, it has the side effect that read requests have to go through multiple structures to find the required record. In a distributed database system, the different parts of the LSM-tree are stored in a distributed fashion. As a result, a server in the query layer has to issue multiple network requests to pull data items from the underlying storage layer. To address this problem, this work proposes a precise data access strategy with two components: an efficient structure with low maintenance overhead that tests whether a record exists in the in-writing part of the LSM-tree, and a lease-based synchronization strategy that maintains consistent copies of the structure on remote query servers. We further prove that the technique works robustly while the LSM-tree is reorganizing multiple structures in the backend. It is also fault-tolerant: it can recover the structures used in data access after node failures. Experiments using the YCSB benchmark show that the solution achieves a 6x throughput improvement over existing methods.
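
The read path described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `WriteNode` and `QueryServer`, the use of a plain Python `set` as the membership structure, the injected `clock`, and the direct method calls standing in for network messages are all assumptions made for illustration. The idea shown is only that a query server holding a valid lease can consult its local copy of the membership structure and skip the remote request when the key is not in the in-writing part.

```python
class WriteNode:
    """Hypothetical node that holds the in-writing part of the LSM-tree."""

    def __init__(self, clock, lease_seconds=5.0):
        self.clock = clock
        self.lease_seconds = lease_seconds
        self.memtable = {}        # in-writing part: key -> latest value
        self.holders = {}         # query server -> lease expiry time
        self.remote_reads = 0     # counts simulated network round trips

    def grant_lease(self, server):
        """Hand out a copy of the key set together with a lease expiry."""
        expiry = self.clock() + self.lease_seconds
        self.holders[server] = expiry
        return set(self.memtable), expiry

    def put(self, key, value):
        # Before a new key becomes visible, push it to every server whose
        # lease is still valid. Expired holders must renew before reading.
        if key not in self.memtable:
            now = self.clock()
            for server, expiry in self.holders.items():
                if now < expiry:
                    server.on_new_key(key)
        self.memtable[key] = value

    def get(self, key):
        self.remote_reads += 1
        return self.memtable.get(key)


class QueryServer:
    """Hypothetical query-layer server with a leased copy of the key set."""

    def __init__(self, write_node, static_store, clock):
        self.write_node = write_node
        self.static_store = static_store  # stands in for the read-only parts
        self.clock = clock
        self.keys = set()
        self.lease_expiry = float("-inf")

    def on_new_key(self, key):
        self.keys.add(key)

    def read(self, key):
        # An expired lease means the local copy may be stale, so refresh it.
        if self.clock() >= self.lease_expiry:
            self.keys, self.lease_expiry = self.write_node.grant_lease(self)
        # Contact the write node only when the key is known to be there.
        if key in self.keys:
            value = self.write_node.get(key)
            if value is not None:
                return value
        return self.static_store.get(key)


def demo():
    now = [0.0]                      # mutable fake clock for determinism
    clock = lambda: now[0]
    writer = WriteNode(clock, lease_seconds=5.0)
    server = QueryServer(writer, {"a": "old-a", "b": "old-b"}, clock)
    writer.put("a", "new-a")
    first = server.read("a")         # found in the in-writing part
    second = server.read("b")        # served without a remote request
    return first, second, writer.remote_reads
```

In this sketch a real deployment would also have to handle clock skew between nodes, deletions, and the removal of keys when the in-writing part is compacted; none of these is modeled here.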

Acknowledgements

This work was partially supported by the National High-tech R&D Program (2015AA015307), the National Natural Science Foundation of China (Grant Nos. 61702189, 61432006 and 61672232), and the Youth Science and Technology “Yang Fan” Program of Shanghai (17YF1427800).

Author information

Corresponding author

Correspondence to Huiqi Hu.

Additional information

Tao Zhu is a PhD candidate in the School of Data Science and Engineering, East China Normal University, China. His research interests mainly include database system implementation, transaction processing and distributed systems.

Huiqi Hu is currently a lecturer in the School of Data Science and Engineering, East China Normal University, China. He received his PhD degree from Tsinghua University, China. His research interests mainly include database system theory and implementation, and query optimization.

Weining Qian is currently a professor in computer science at East China Normal University, China. He received his MS and PhD in computer science from Fudan University, China in 2001 and 2004, respectively. He served as the co-chair of WISE 2012 Challenge, and program committee member of several international conferences, including ICDE 2009/2010/2012 and KDD 2013. His research interests include Web data management and mining of massive data sets.

Huan Zhou is a PhD candidate in the School of Data Science and Engineering, East China Normal University, China. Her research interests include in-memory database system implementation, parallel computing and transaction processing.

Aoying Zhou is a professor of computer science at East China Normal University, China, where he heads the Institute for Data Science and Engineering. He received his master's and bachelor's degrees in computer science from Sichuan University, China in 1988 and 1985, respectively, and his PhD degree from Fudan University, China in 1993. He is currently the vice-director of ACM SIGMOD China and of the Technology Committee on Database of China Computer Federation. He serves on the editorial boards of several prestigious academic journals, such as VLDB Journal and WWW Journal. His research interests include Web data management, data management for data-intensive computing, and in-memory data analytics.

Cite this article

Zhu, T., Hu, H., Qian, W. et al. Fault-tolerant precise data access on distributed log-structured merge-tree. Front. Comput. Sci. 13, 760–777 (2019). https://doi.org/10.1007/s11704-018-7198-6

  • DOI: https://doi.org/10.1007/s11704-018-7198-6
