Abstract
The DWS (Data Warehouse Striping) technique is a round-robin data partitioning approach designed specifically for distributed data warehousing environments. In DWS the fact tables are distributed across an arbitrary number of low-cost computers and queries are executed in parallel by all the computers, guaranteeing nearly optimal speed-up and scale-up. However, the use of a large number of inexpensive nodes increases the risk of node failures that impair the computation of queries. This paper proposes an approach that gives Data Warehouse Striping the capability of answering queries even in the presence of node failures. The approach is based on the selective replication of data over the cluster nodes, which guarantees full availability when one or more nodes fail. The proposal was evaluated using the new TPC-DS benchmark, and the results show that the approach is quite effective.
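To make the idea concrete, the following is a minimal sketch of round-robin striping combined with replication, written under stated assumptions. The abstract does not describe the actual replica placement, so the neighbour-based placement used here, along with the function names `stripe_round_robin`, `replicate_to_neighbours` and `parallel_sum`, is hypothetical and is not the scheme proposed in the paper. Fact rows are reduced to plain integers standing in for a single measure.

```python
def stripe_round_robin(rows, num_nodes):
    """Assign fact rows to nodes in round-robin order (row i goes to node i mod N)."""
    partitions = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        partitions[i % num_nodes].append(row)
    return partitions


def replicate_to_neighbours(partitions, copies=1):
    """Hypothetical placement (an assumption, not the paper's scheme):
    node k also stores the partitions of the `copies` nodes preceding it."""
    n = len(partitions)
    replicas = []
    for k in range(n):
        held = {(k - j) % n: partitions[(k - j) % n] for j in range(1, copies + 1)}
        replicas.append(held)
    return replicas


def parallel_sum(partitions, replicas, failed=frozenset()):
    """Sum the measure over every partition, reading a replica when the owner is down."""
    n = len(partitions)
    total = 0
    for owner in range(n):
        if owner not in failed:
            total += sum(partitions[owner])
            continue
        # Look for a live node that holds a copy of the failed node's partition.
        holder = next(
            (k for k in range(n) if k not in failed and owner in replicas[k]),
            None,
        )
        if holder is None:
            raise RuntimeError(f"partition {owner} is unavailable")
        total += sum(replicas[holder][owner])
    return total


facts = list(range(1, 101))            # stand-in for a fact table measure
parts = stripe_round_robin(facts, 4)   # four inexpensive nodes
reps = replicate_to_neighbours(parts, copies=1)
answer_all_up = parallel_sum(parts, reps)
answer_one_down = parallel_sum(parts, reps, failed={2})
```

With a single copy per partition, this placement tolerates the loss of any one node but not of two adjacent nodes; raising `copies` trades storage for tolerance of more simultaneous failures.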
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vieira, J., Vieira, M., Costa, M., Madeira, H. (2008). Redundant Array of Inexpensive Nodes for DWS. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds) Database Systems for Advanced Applications. DASFAA 2008. Lecture Notes in Computer Science, vol 4947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78568-2_49
DOI: https://doi.org/10.1007/978-3-540-78568-2_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78567-5
Online ISBN: 978-3-540-78568-2
eBook Packages: Computer Science, Computer Science (R0)