Abstract
Decision support systems that include on-line analytical processing and data mining have recently attracted research attention. Such applications treat data in very large databases as multidimensional data cubes. Each cell of a data cube is typically an aggregate of interest to analysts, such as total sales volume. Because many cells may need to be computed and performance is critical, we propose parallel algorithms that compute multiple aggregate queries over data cubes on a shared-nothing multiprocessor with high-bandwidth communication facilities. We evaluate the algorithms through analytical modeling and an implementation on an IBM SP2 system.
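To make the setting concrete, the following is a minimal sketch of computing several group-by aggregates over a data cube when the rows are partitioned across nodes that share nothing. It is not the paper's algorithm, which the abstract does not describe. It assumes a simple two-phase scheme: each node scans its own partition once and builds partial sums for every query, and the partial results are then merged. The nodes are simulated by plain lists in one process, and all names (`partition`, `local_aggregate`, `merge`, `compute_cube`) and the sample columns are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def partition(rows, n_nodes):
    """Distribute rows round-robin across simulated nodes (an assumption)."""
    parts = [[] for _ in range(n_nodes)]
    for i, row in enumerate(rows):
        parts[i % n_nodes].append(row)
    return parts


def local_aggregate(rows, queries, measure):
    """Scan the local rows once, updating partial sums for every query."""
    partial = {q: defaultdict(int) for q in queries}
    for row in rows:
        for q in queries:
            key = tuple(row[d] for d in q)  # cell coordinates for this group-by
            partial[q][key] += row[measure]
    return partial


def merge(partials, queries):
    """Combine per-node partial sums into the final cells."""
    total = {q: defaultdict(int) for q in queries}
    for p in partials:
        for q in queries:
            for key, value in p[q].items():
                total[q][key] += value
    return {q: dict(cells) for q, cells in total.items()}


def all_group_bys(dims):
    """Every subset of the dimensions, i.e. every group-by in the cube."""
    return [c for k in range(len(dims) + 1) for c in combinations(dims, k)]


def compute_cube(rows, dims, measure, n_nodes=4):
    queries = all_group_bys(dims)
    # In a real shared-nothing system each call below would run on its own node.
    partials = [local_aggregate(p, queries, measure)
                for p in partition(rows, n_nodes)]
    return merge(partials, queries)


if __name__ == "__main__":
    sample = [
        {"product": "a", "region": "east", "sales": 10},
        {"product": "a", "region": "west", "sales": 5},
        {"product": "b", "region": "east", "sales": 7},
    ]
    cube = compute_cube(sample, ("product", "region"), "sales", n_nodes=2)
    for query, cells in cube.items():
        print(query, cells)
```

Because sums are distributive, the merged result does not depend on how the rows are partitioned. The merge step is where inter-node communication would occur on a real machine.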
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fukuda, T., Matsuzawa, H. (1998). Parallel processing of multiple aggregate queries on shared-nothing multiprocessors. In: Schek, HJ., Alonso, G., Saltor, F., Ramos, I. (eds) Advances in Database Technology — EDBT'98. EDBT 1998. Lecture Notes in Computer Science, vol 1377. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0100991
DOI: https://doi.org/10.1007/BFb0100991
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64264-0
Online ISBN: 978-3-540-69709-1
eBook Packages: Springer Book Archive