Large Relations in Node-Partitioned Data Warehouses

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3453)

Abstract

An inexpensive shared-nothing environment can provide significant speedup on large data warehouses, but partitioning and placement decisions are important in such systems, because repartitioning requirements can result in much less-than-linear speedup. This problem can be minimized if the query workload and schemas are taken as inputs to placement decisions. In this paper we use a cost model to analyze the problem of handling large relations in a node-partitioned data warehouse (NPDW) under a basic placement strategy that partitions facts horizontally and replicates dimensions. We then propose a strategy to improve performance and present both analytical and TPC-H results.
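
The basic placement strategy named in the abstract (facts partitioned horizontally, dimensions replicated) can be illustrated with a minimal Python sketch. It is not the paper's cost model or its proposed improvement. The table names, column names, and the modulo partitioning rule are assumptions made for illustration only. Because each node holds a full copy of the dimension, every node can join and aggregate its own fact partition locally, and only the small partial results need to be merged.

```python
from collections import defaultdict

def place(fact_rows, dim_rows, n_nodes, key):
    """Partition fact rows horizontally on an integer `key` (modulo rule,
    an assumption) and replicate the whole dimension on every node."""
    nodes = [{"fact": [], "dim": list(dim_rows)} for _ in range(n_nodes)]
    for row in fact_rows:
        nodes[row[key] % n_nodes]["fact"].append(row)
    return nodes

def local_sum_by_region(node):
    """Join the local fact partition with the replicated dimension and
    aggregate; no data from other nodes is required."""
    region_of = {d["cust_id"]: d["region"] for d in node["dim"]}
    totals = defaultdict(float)
    for f in node["fact"]:
        totals[region_of[f["cust_id"]]] += f["amount"]
    return totals

def merge(partials):
    """Combine the per-node partial aggregates into the final answer."""
    out = defaultdict(float)
    for partial in partials:
        for region, amount in partial.items():
            out[region] += amount
    return dict(out)

# Hypothetical toy data: a small "sales" fact and a "customers" dimension.
sales = [{"order_id": i, "cust_id": i % 3, "amount": 10.0 * i}
         for i in range(1, 7)]
customers = [
    {"cust_id": 0, "region": "EU"},
    {"cust_id": 1, "region": "US"},
    {"cust_id": 2, "region": "EU"},
]

nodes = place(sales, customers, n_nodes=2, key="order_id")
result = merge(local_sum_by_region(n) for n in nodes)
print(result)
```

The sketch only works this simply because the dimension is small enough to copy to every node. When a relation other than the partitioned fact is itself large, replicating it on every node becomes costly, which is the situation the paper analyzes.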

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Furtado, P. (2005). Large Relations in Node-Partitioned Data Warehouses. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_49

  • DOI: https://doi.org/10.1007/11408079_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25334-1

  • Online ISBN: 978-3-540-32005-0

  • eBook Packages: Computer Science, Computer Science (R0)
