Parallel multisource view maintenance

The VLDB Journal

Abstract.

In a distributed environment, materialized views integrate data from different information sources and store it in a centralized location. To maintain such materialized views, the data warehouse management system must send maintenance queries to the information sources. Because the information sources and the data warehouse are independent, concurrency issues arise between the maintenance queries and the local update transactions at each information source. Existing solutions such as ECA and Strobe handle such concurrent maintenance, but they require the information sources to be quiescent. SWEEP and POSSE overcome this limitation by decomposing the global maintenance query into smaller subqueries, which are sent to the individual information sources, and then performing conflict correction locally at the data warehouse. All of these approaches, however, handle data updates one at a time, so either some of the information sources or the data warehouse is likely to be idle during most of the maintenance process. In this paper, we propose that a set of updates be maintained in parallel by several concurrent maintenance processes, so that both the information sources and the warehouse are utilized more fully throughout the maintenance process. This parallelism should in turn improve overall maintenance performance. To this end, we have developed a parallel view maintenance algorithm, called PVM, that substantially improves upon the performance of previous maintenance approaches by handling a set of data updates at the same time. The parallel handling of a set of updates is orthogonal to the particular maintenance algorithm applied to each individual update.
To perform parallel view maintenance, we have identified two critical issues that must be overcome: (1) detecting maintenance-concurrent data updates in a parallel mode and (2) handling the fact that, because of parallel maintenance, the order in which effects are committed to the data warehouse may not match the order in which the data warehouse processes the updates. In this work, we provide solutions to both issues. For the former, we insert a middle-layer timestamp assignment module that detects maintenance-concurrent data updates without requiring any global clock synchronization. For the latter, we introduce the negative counter concept, which tolerates variations in the order in which the effects of data updates are committed to the data warehouse. We provide a correctness proof for PVM that guarantees that our strategy generates the correct final data warehouse state. We have implemented both SWEEP and PVM in our EVE data warehousing system. Our performance study demonstrates that PVM achieves a manyfold performance improvement over SWEEP.
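The two mechanisms named above can be illustrated with a small sketch. This is not the PVM implementation: the abstract gives no algorithmic detail, so the class names (`TimestampAssigner`, `CountedView`), the source labels, and the detection rule (an update from a source counts as maintenance-concurrent if it reaches the middle layer after the update being maintained and before that source's query answer) are all assumptions made for illustration. The sketch shows only that a local arrival counter can order messages without a global clock, and that signed tuple counters make the final view state independent of commit order.

```python
from itertools import count


class TimestampAssigner:
    """Hypothetical middle-layer module: stamps messages on arrival."""

    def __init__(self):
        self._clock = count(1)   # local logical clock, no global synchronization
        self.update_log = []     # (stamp, source, update) in arrival order

    def stamp(self):
        return next(self._clock)

    def on_update(self, source, update):
        ts = self.stamp()
        self.update_log.append((ts, source, update))
        return ts

    def concurrent_updates(self, source, handled_ts, answer_ts):
        # Assumed rule: updates from `source` that arrived after the update
        # being maintained and before the source's query answer.
        return [u for ts, s, u in self.update_log
                if s == source and handled_ts < ts < answer_ts]


class CountedView:
    """Hypothetical view extent whose tuple counters may go negative."""

    def __init__(self):
        self.counts = {}

    def commit(self, delta):
        # Apply signed counts; a delete may commit before its matching insert.
        for row, n in delta.items():
            new = self.counts.get(row, 0) + n
            if new == 0:
                self.counts.pop(row, None)
            else:
                self.counts[row] = new

    def visible(self):
        # Only rows with a positive counter are exposed to readers.
        return {row: n for row, n in self.counts.items() if n > 0}


assigner = TimestampAssigner()
t_u1 = assigner.on_update("IS1", "u1")   # update being maintained
t_u2 = assigner.on_update("IS2", "u2")   # arrives while u1's query is pending
t_answer = assigner.stamp()              # answer from IS2 arrives
conflicts = assigner.concurrent_updates("IS2", t_u1, t_answer)

insert, delete = {("k", 1): +1}, {("k", 1): -1}
in_order, out_of_order = CountedView(), CountedView()
in_order.commit(insert)
in_order.commit(delete)
out_of_order.commit(delete)              # counter temporarily -1
hidden = out_of_order.visible()
out_of_order.commit(insert)              # counter returns to 0
```

In this toy run, `u2` is flagged as concurrent with the maintenance of `u1`, and both views end in the same state even though one of them committed the delete first.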

References

  1. Agrawal D, El Abbadi A, Singh A, Yurek T (1997) Efficient view maintenance at data warehouses. In: Abstracts of ACM SIGMOD international conference on management of data, Tucson, AZ, 13-15 May 1997, pp 417-427

  2. Baralis E, Ceri S, Paraboschi S (1996) Conservative timestamp revised for materialized view maintenance in a data warehouse. In: Abstracts of the workshop on materialized views: techniques and applications (VIEW 1996), Montreal, 7 June 1996, in cooperation with SIGMOD conference 1996, pp 1-9

  3. Colby L, Kawaguchi A, Lieuwen D, Mumick I, Ross K (1997) Supporting multiple view maintenance policies. In: Abstracts of the ACM SIGMOD international conference on management of data, SIGMOD 1997, 13-15 May 1997, Tucson, AZ, pp 405-416

  4. Colby LS, Griffin T, Libkin L, Mumick IS, Trickey H (1996) Algorithms for deferred view maintenance. In: Abstracts of ACM SIGMOD international conference on management of data, Montreal, 4-6 June 1996

  5. Colby LS, Mumick IS (1996) Staggered maintenance of multiple views. In: Abstracts from workshop on materialized views: techniques and applications, pp 119-128 (VIEW 1996), Montreal, 7 June 1996

  6. Ding L, Zhang X, Rundensteiner EA (1999) The MRE wrapper approach: enabling incremental view maintenance of data warehouses defined on multi-relation information sources. In: Abstracts of the ACM 2nd international workshop on data warehousing and OLAP (DOLAP’99), 6 November 1999, Kansas City, MO, pp 30-35

  7. Gupta A, Jagadish H, Mumick I (1996) Data integration using self-maintainable views. In: Abstracts of the 5th international conference on extending database technology (EDBT), Avignon, France, 25-29 March 1996, pp 140-144

  8. Gupta A, Jagadish HV, Mumick IS (1997) Maintenance and self maintenance of outer-join views. In: The 3rd international workshop on next generation information technologies and systems (NGITS ‘97), Neve Ilan, Israel

  9. Gupta A, Mumick I (1995) Maintenance of materialized views: problems, techniques, and applications. IEEE Data engineering bulletin, special issue on materialized views and warehousing, 18(2):3-19

  10. Gupta A, Mumick IS, Subrahmanian VS (1993) Maintaining views incrementally. In: Abstracts of ACM SIGMOD international conference on management of data, Washington, D.C., 26-28 May 1993, pp 157-166

  11. Huyn N (1996) Efficient view self-maintenance. Abstracts of the workshop on materialized views: techniques and applications (VIEW 1996), Montreal, 7 June 1996, pp 17-25

  12. Kawaguchi A, Lieuwen DF, Mumick IS, Quass D, Ross KA (1997) Concurrency control theory for deferred materialized views. Abstracts of database theory - ICDT ‘97, 6th international conference, Delphi, Greece, 8-10 January 1997, pp 306-320

  13. Kawaguchi A, Lieuwen DF, Mumick IS, Ross KA (1997) Implementing incremental view maintenance in nested data models. In: Abstracts from the 6th international workshop on database programming languages (DBPL-6), Estes Park, CO, 18-20 August 1997, pp 202-221

  14. Labio WJ, Yerneni R, García-Molina H (1999) Shrinking the warehouse update window. In: Abstracts of ACM SIGMOD international conference on management of data, 1-3 June 1999, Philadelphia, pp 383-395

  15. Lee AJ, Nica A, Rundensteiner EA (2001) The EVE approach: view synchronization in dynamic distributed environments. IEEE Transactions on knowledge and data engineering, 14(5):931-954

  16. Liu B, Chen S, Rundensteiner EA (2002) A transactional approach for parallel data warehouse maintenance. Technical report WPI-CS-TR-02-08, Worcester Polytechnic Institute

  17. Mohania MK, Konomi S, Kambayashi Y (1997) Incremental maintenance of materialized views. In: Abstracts of the 8th international conference on database and expert systems applications (DEXA ‘97), Toulouse, France, 1-5 September 1997, pp 551-560

  18. Nica A, Lee AJ, Rundensteiner EA (1998) The CVS algorithm for view synchronization in evolvable large-scale information systems. In: Abstracts of the international conference on extending database technology (EDBT’98), Valencia, Spain, March 1998, pp 359-373

  19. O’Gorman K, Agrawal D, Abbadi AE (1999) Posse: a framework for optimizing incremental view maintenance at data warehouses. In: Abstracts of the 1st international conference on data warehousing and knowledge discovery (DaWaK ‘99), Florence, 30 August-1 September 1999, pp 106-115

  20. Quass D, Gupta A, Mumick IS, Widom J (1996) Making views self-maintainable for data warehousing. In: Abstracts of the 4th international conference on parallel and distributed information systems, 18-20 December 1996, Miami Beach, pp 158-169

  21. Rundensteiner EA, Koeller A, Zhang X (2000) Maintaining data warehouses over changing information sources. Communications of the ACM, pp 57-62

  22. Rundensteiner EA, Koeller A, Zhang X, Lee A, Nica A, VanWyk A, Li Y (1999) Evolvable view environment. In: Abstracts of ACM SIGMOD international conference on management of data, 1-3 June 1999, Philadelphia, pp 553-555

  23. Salem K, Beyer KS, Cochrane R, Lindsay BG (2000) How to roll a join: asynchronous incremental view maintenance. In: Chen W, Naughton JF, Bernstein PA (eds) Proceedings of the 2000 ACM SIGMOD international conference on management of data, Dallas, 16-18 May 2000, 29(2):129-140

  24. Samtani S, Kumar V (1998) Maintaining consistency in partially self-maintainable views at the data warehouse. In: Abstracts of the 9th international conference on systems applications (DEXA ‘98), Vienna, 24-28 August 1998, pp 206-211

  25. Wiener JL, Gupta H, Labio W, Zhuge Y, Garcia-Molina H, Widom J (1996) A system prototype for warehouse view maintenance. Abstracts of the workshop on materialized views: techniques and applications (VIEW 1996), Montreal, 7 June 1996, pp 26-33

  26. Zhang X, Rundensteiner EA (1999) Flexible data warehouse maintenance under concurrent schema and data updates. In: Abstracts of IEEE international conference on data engineering, special poster session, Sydney, March 1999, p 253

  27. Zhang X, Rundensteiner EA (1999) The SDCC framework for integrating existing algorithms for diverse data warehouse maintenance tasks. In: Abstracts of the international database engineering and applications symposium (IDEAS 1999), 2-4 August 1999, Montreal, pp 206-214

  28. Zhang X, Rundensteiner EA (2000) DyDa: dynamic data warehouse maintenance in a fully concurrent environment. In: Abstracts of data warehousing and knowledge discovery. Lecture notes in computer science. Springer, Berlin Heidelberg New York, September 2000, pp 94-103

  29. Zhang X, Rundensteiner EA, Ding L (2001) PVM: parallel view maintenance under concurrent data updates of distributed sources. In: Abstracts of data warehousing and knowledge discovery, Munich, September 2001, pp 230-239

  30. Zhuge Y, García-Molina H, Hammer J, Widom J (1995) View maintenance in a warehousing environment. In: Abstracts of the 1995 ACM SIGMOD international conference on management of data, San Jose, 22-25 May 1995, pp 316-327

  31. Zhuge Y, García-Molina H, Wiener JL (1996) The Strobe algorithms for multi-source warehouse consistency. In: Abstracts of the 4th international conference on parallel and distributed information systems, 18-20 December 1996, Miami Beach, pp 146-157

  32. Zhuge Y, García-Molina H, Wiener JL (1998) Consistency algorithms for multi-source warehouse view maintenance. Distributed Parallel Databases 6(1):7-40

  33. Zhuge Y, Wiener JL, García-Molina H (1997) Multiple view consistency for data warehousing. In: Abstracts of the 13th international conference on data engineering, 7-11 April 1997 Birmingham, UK, pp 289-300

Author information

Correspondence to Xin Zhang.

Additional information

Received: 12 November 2001, Accepted: 18 December 2002, Published online: 31 July 2003

This work was supported in part by the NSF NYI grant IIS-979624 and NSF CISE Instrumentation grant IRIS 97-29878 and NSF grant IIS-9988776.

About this article

Cite this article

Zhang, X., Ding, L. & Rundensteiner, E.A. Parallel multisource view maintenance. VLDB 13, 22–48 (2004). https://doi.org/10.1007/s00778-003-0086-0

  • DOI: https://doi.org/10.1007/s00778-003-0086-0
