Incremental join view maintenance on distributed log-structured storage

  • Research Article
  • Frontiers of Computer Science

Abstract

Modern database systems urgently need to support highly scalable transactions and efficient queries simultaneously for real-time applications. One solution is to apply query optimization techniques to on-line transaction processing (OLTP) systems. The materialized view is widely regarded as an effective way to reduce query latency. However, it also incurs a significant maintenance cost, which degrades transaction performance. In this paper, we examine the design space and identify several design features for implementing views on a distributed log-structured merge-tree (LSM-tree), a well-known structure for improving write performance. Based on these features, we develop two incremental view maintenance (IVM) approaches on the LSM-tree. The first avoids join computation in view maintenance transactions. The second, with two optimizations, decouples view maintenance from transaction processing. Under asynchronous updates, we also support consistent queries on views. Experiments on the TPC-H benchmark show that our methods achieve better performance than straightforward methods on different workloads.
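
As a minimal sketch of the general idea behind incremental maintenance of a join view, the following Python example keeps a materialized equi-join up to date by joining each newly inserted row only against the matching rows of the other relation, rather than recomputing the whole join. It illustrates the textbook delta rule only, not the two approaches developed in the paper. The class `JoinView`, its methods, and the sample rows are hypothetical, and deletions, distribution, and LSM-tree storage are deliberately left out.

```python
from collections import defaultdict


class JoinView:
    """Materialized equi-join of two relations R and S on a shared join key.

    Textbook delta rule (an assumption for illustration, not the paper's
    algorithm): inserting a row r into R grows the view by {r} JOIN S, so
    only the rows of S with the same key are touched.
    """

    def __init__(self):
        self.r_by_key = defaultdict(list)  # join key -> rows of R
        self.s_by_key = defaultdict(list)  # join key -> rows of S
        self.view = []                     # materialized (r_row, s_row) pairs

    def insert_r(self, key, row):
        self.r_by_key[key].append(row)
        # Delta: join only the new R row against matching S rows.
        self.view.extend((row, s) for s in self.s_by_key[key])

    def insert_s(self, key, row):
        self.s_by_key[key].append(row)
        # Delta: join only the new S row against matching R rows.
        self.view.extend((r, row) for r in self.r_by_key[key])

    def recompute(self):
        """Full recomputation, used here only to check the incremental result."""
        return [(r, s)
                for key, r_rows in self.r_by_key.items()
                for r in r_rows
                for s in self.s_by_key.get(key, [])]


# Hypothetical sample data loosely modeled on an orders/lineitem join.
v = JoinView()
v.insert_r(1, "order-1")
v.insert_s(1, "lineitem-a")
v.insert_s(1, "lineitem-b")
v.insert_r(2, "order-2")  # no matching S rows, so the view is unchanged
assert sorted(v.view) == sorted(v.recompute())
```

The final assertion checks that the incrementally maintained view matches a full recomputation, which is the correctness condition any IVM scheme has to preserve.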

Acknowledgements

This work was partially supported by the Youth Foundation of the National Science Foundation (61702189) and the National Science Foundation (61772202).

Author information

Corresponding author

Correspondence to Huiqi Hu.

Additional information

Huichao Duan received his bachelor's degree from Zhejiang Normal University, China in 2015. He is currently working toward a PhD degree at East China Normal University, China. His research interests include materialized views and query optimization in distributed databases.

Huiqi Hu is currently an assistant professor in the School of Data Science and Engineering, East China Normal University, China. His research interests mainly include database systems, query optimization, and distributed systems with new hardware. He has contributed to a number of research and industrial database projects.

Weining Qian is currently a professor of computer science at East China Normal University, China. He received his MS and PhD degrees in computer science from Fudan University, China in 2001 and 2004, respectively. He served as co-chair of the WISE 2012 Challenge and as a program committee member of several international conferences, including ICDE 2009/2010/2012 and KDD 2013. His research interests include Web data management and mining of massive datasets.

Aoying Zhou is a professor at East China Normal University, China. He is now acting as a vice-director of the ACM SIGMOD China and Database Technology Committee of the China Computer Federation. He is serving as a member of the editorial boards of the VLDB Journal, WWW Journal, etc. His research interests include data management for data-intensive computing, and memory cluster computing. He is a member of the IEEE.

About this article

Cite this article

Duan, H., Hu, H., Qian, W. et al. Incremental join view maintenance on distributed log-structured storage. Front. Comput. Sci. 15, 154607 (2021). https://doi.org/10.1007/s11704-020-9310-y

  • DOI: https://doi.org/10.1007/s11704-020-9310-y
