DOI: 10.1145/319757.319793

A dynamic load balancing strategy for parallel datacube computation

Published: 06 November 1999

ABSTRACT

In recent years, OLAP technologies have become one of the important applications in the database industry. In particular, the datacube operation proposed in [5] has received strong attention from researchers as a fundamental research topic in OLAP. The datacube operation requires computing aggregations over all possible combinations of the dimension attributes. As the number of dimensions increases, computing datacubes becomes very expensive, because the computation cost grows exponentially with the number of dimensions. Parallelization is therefore a very important factor in fast datacube computation. However, in the presence of data skew, parallelizing the computation does not by itself yield a sufficient performance gain. In this paper, we present a dynamic load balancing strategy that allows the benefits of parallelizing datacube computation to be realized fully. We perform simulation-based experiments and show that our strategy performs well.
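
To make the exponential cost concrete, the following is a minimal sketch of a naive, sequential datacube computation. It is not the strategy presented in the paper. The function name `datacube`, the sample rows, and the column names are hypothetical and chosen only for illustration. With d dimensions the cube contains 2^d group-bys, so three dimensions already produce eight.

```python
from collections import defaultdict
from itertools import combinations


def datacube(rows, dims, measure):
    """Naive cube: one SUM group-by per subset of the dimension attributes.

    Illustrative only; every group-by rescans all rows, which no
    practical datacube algorithm would do.
    """
    cube = {}
    for k in range(len(dims) + 1):
        for subset in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                # The grouping key is the row's values on the chosen dimensions.
                key = tuple(row[d] for d in subset)
                totals[key] += row[measure]
            cube[subset] = dict(totals)
    return cube


# Hypothetical sample data (not from the paper).
rows = [
    {"product": "pen", "region": "east", "year": 1998, "sales": 10},
    {"product": "pen", "region": "west", "year": 1999, "sales": 5},
    {"product": "ink", "region": "east", "year": 1999, "sales": 7},
]
dims = ("product", "region", "year")
cube = datacube(rows, dims, "sales")

print(len(cube))            # number of group-bys: 2 ** 3 = 8
print(cube[()])             # grand total over all rows
print(cube[("product",)])   # totals per product
```

Each added dimension doubles the number of group-bys, which is why the abstract treats parallelization as essential. A skewed attribute (here, most sales falling under one product) concentrates work in a few groups, so a static assignment of groups to processors can leave some of them idle.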

References

  1. S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan and S. Sarawagi, "On the Computation of Multidimensional Aggregates", In Proceedings of the International Conference on Very Large Databases, pages 506-521, 1996.
  2. K. S. Beyer and R. Ramakrishnan, "Bottom-Up Computation of Sparse and Iceberg CUBES", In Proceedings of the ACM SIGMOD Conference on Management of Data, pages 359-370, 1999.
  3. P. M. Deshpande, S. Agarwal, J. F. Naughton and R. Ramakrishnan, "Computation of Multidimensional Aggregates", Technical Report 1314, University of Wisconsin, Madison, 1996.
  4. D. J. DeWitt, J. F. Naughton, D. A. Schneider and S. Seshadri, "Practical Skew Handling in Parallel Joins", In Proceedings of the International Conference on Very Large Databases, pages 27-40, 1992.
  5. J. Gray, A. Bosworth, A. Layman and H. Pirahesh, "A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals", In Proceedings of the IEEE International Conference on Data Engineering, pages 152-159, 1996.
  6. S. Goil and A. Choudhary, "High Performance OLAP and Data Mining on Parallel Computers", Journal of Data Mining and Knowledge Discovery, 1(4):391-417, 1997.
  7. K. A. Hua and C. Lee, "Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning", In Proceedings of the International Conference on Very Large Databases, pages 525-535, 1991.
  8. V. Harinarayan, A. Rajaraman and J. D. Ullman, "Implementing Data Cubes Efficiently", In Proceedings of the ACM SIGMOD Conference on Management of Data, pages 205-216, 1996.
  9. M. Kitsuregawa and Y. Ogawa, "Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC)", In Proceedings of the International Conference on Very Large Databases, pages 210-221, 1990.
  10. K. A. Ross and D. Srivastava, "Fast Computation of Sparse Datacubes", In Proceedings of the International Conference on Very Large Databases, pages 116-125, 1997.
  11. S. Sarawagi, R. Agrawal and A. Gupta, "On Computing the Data Cube", Research Report RJ10026, IBM Almaden Research Center, San Jose, CA, 1996.
  12. A. Shatdal and J. F. Naughton, "Adaptive Parallel Aggregation Algorithms", In Proceedings of the ACM SIGMOD Conference on Management of Data, pages 104-114, 1995.
  13. A. Shukla, P. Deshpande, J. F. Naughton and K. Ramasamy, "Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies", In Proceedings of the International Conference on Very Large Databases, pages 522-531, 1996.
  14. C. B. Walton, A. G. Dale and R. M. Jenevein, "A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins", In Proceedings of the International Conference on Very Large Databases, pages 537-548, 1991.
  15. Y. Zhao, P. M. Deshpande and J. F. Naughton, "An Array-Based Algorithm for Simultaneous Multidimensional Aggregates", In Proceedings of the ACM SIGMOD Conference on Management of Data, pages 159-170, 1997.

    • Published in

      DOLAP '99: Proceedings of the 2nd ACM international workshop on Data warehousing and OLAP
      November 1999
      108 pages
      ISBN: 1581132204
      DOI: 10.1145/319757

      Copyright © 1999 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 29 of 79 submissions, 37%
