Main Memory-Based Algorithms for Efficient Parallel Aggregation for Temporal Databases

Distributed and Parallel Databases

Abstract

The ability to model the temporal dimension is essential to many applications. Furthermore, the rate of increase in database size and the stringency of response time requirements have outpaced advancements in processor and mass storage technology, leading to the need for parallel temporal database management systems. In this paper, we introduce a variety of parallel temporal aggregation algorithms for the shared-nothing architecture; these algorithms are based on the sequential Aggregation Tree algorithm. We are particularly interested in developing parallel algorithms that can maximally exploit available memory to quickly compute large-scale temporal aggregates without intermediate disk writes and reads. Via an empirical study, we found that the number of processing nodes, the partitioning of the data, the placement of results, and the degree of data reduction effected by the aggregation all affected the performance of the algorithms. For distributed result placement, we discovered that Greedy Time Division Merge was the obvious choice. For centralized results and high data reduction, Pairwise Merge was preferred for a large number of processing nodes; for low data reduction, it performed well only up to 32 nodes. This led us to a centralized variant of Greedy Time Division Merge, which was best for the remaining cases. We present a cost model that closely predicts the running time of Greedy Time Division Merge.
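
To make the notion of a temporal aggregate concrete, the following is a minimal sketch of a temporal COUNT computed sequentially with an endpoint sweep. It is not the Aggregation Tree algorithm or any of the parallel algorithms named in the abstract; the function name `temporal_count` and the half-open `[start, end)` interval convention are assumptions made for illustration only.

```python
from collections import defaultdict


def temporal_count(intervals):
    """Return (start, end, count) triples, one per maximal time span over
    which the number of valid tuples is constant and non-zero.

    `intervals` holds (start, end) valid-time pairs, treated as half-open
    [start, end). This convention is an assumption of the sketch.
    """
    # Net change in the number of valid tuples at each time point.
    delta = defaultdict(int)
    for start, end in intervals:
        if start >= end:
            raise ValueError(f"empty or inverted interval: ({start}, {end})")
        delta[start] += 1
        delta[end] -= 1

    result = []
    active = 0
    points = sorted(delta)
    # Walk consecutive endpoints; the count is constant between them.
    for left, right in zip(points, points[1:]):
        active += delta[left]
        if active == 0:
            continue  # no tuple is valid in [left, right)
        if result and result[-1][1] == left and result[-1][2] == active:
            # Same count as the adjacent span: extend it instead of splitting.
            result[-1] = (result[-1][0], right, active)
        else:
            result.append((left, right, active))
    return result


if __name__ == "__main__":
    for row in temporal_count([(10, 20), (15, 30), (25, 40)]):
        print(row)
```

In this sketch, three input tuples yield five result rows because each interval endpoint can start a new constant span. The ratio of result rows to input tuples is one informal way to picture the "degree of data reduction" that the abstract identifies as a performance factor.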

Gao, D., Gendrano, J.A.G., Moon, B. et al. Main Memory-Based Algorithms for Efficient Parallel Aggregation for Temporal Databases. Distributed and Parallel Databases 16, 123–163 (2004). https://doi.org/10.1023/B:DAPD.0000028553.70337.e1

  • DOI: https://doi.org/10.1023/B:DAPD.0000028553.70337.e1
