Abstract
A low-cost shared-nothing environment can deliver significant speedup on large data warehouses. Partitioning and placement decisions are critical in such systems, however, because repartitioning requirements can reduce speedup to far below linear. This problem can be minimized if the query workload and schemas are taken as inputs to placement decisions. In this paper, we use a cost model to analyze the problem of handling large relations in a node-partitioned data warehouse (NPDW) under a basic placement strategy that partitions facts horizontally and replicates dimensions. We then propose a strategy to improve performance and present both analytical and TPC-H results.
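The basic placement strategy named in the abstract (facts partitioned horizontally, dimensions replicated) can be illustrated with a minimal sketch. This is not the paper's implementation: the table layout, the column names (`order_id`, `cust_id`, `amount`), the modulo partitioning on an integer key, and the function names are all assumptions made for illustration.

```python
from collections import Counter


def place(facts, n_nodes, key):
    """Horizontally partition fact rows across nodes by an integer key (assumed)."""
    parts = [[] for _ in range(n_nodes)]
    for row in facts:
        parts[row[key] % n_nodes].append(row)
    return parts


def local_aggregate(fact_part, customer_dim):
    """Join a local fact fragment with the replicated dimension and sum per region."""
    partial = Counter()
    for row in fact_part:
        region = customer_dim[row["cust_id"]]  # dimension copy is local to the node
        partial[region] += row["amount"]
    return partial


def run_query(facts, customer_dim, n_nodes):
    """Simulate each node computing a partial result, then merge the partials."""
    total = Counter()
    for part in place(facts, n_nodes, key="order_id"):
        total.update(local_aggregate(part, customer_dim))
    return dict(total)


# Hypothetical toy data: one fact table and one small dimension table.
facts = [
    {"order_id": 1, "cust_id": 10, "amount": 5},
    {"order_id": 2, "cust_id": 11, "amount": 7},
    {"order_id": 3, "cust_id": 10, "amount": 1},
    {"order_id": 4, "cust_id": 12, "amount": 4},
]
customer_dim = {10: "EU", 11: "US", 12: "EU"}

result = run_query(facts, customer_dim, n_nodes=3)
```

Because every node holds a full copy of the dimension, the join in this sketch needs no data movement between nodes. A join between two large relations partitioned on different keys would not have that property and would require repartitioning, which is the situation the abstract identifies as the source of less-than-linear speedup.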
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Furtado, P. (2005). Large Relations in Node-Partitioned Data Warehouses. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_49
DOI: https://doi.org/10.1007/11408079_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25334-1
Online ISBN: 978-3-540-32005-0
eBook Packages: Computer Science (R0)